ABSTRACT

Aims. Pointed observations with XMM-Newton provide the basis for creating catalogues of X-ray sources detected serendipitously
in each field. This paper describes the creation and characteristics of the 2XMM catalogue.

Methods. The 2XMM catalogue has been compiled from a new processing of the XMM-Newton EPIC camera data. The main features
of the processing pipeline are described in detail.

Results. The catalogue, the largest ever made at X-ray wavelengths, contains 246,897 detections drawn from 3491 public
XMM-Newton observations over a 7-year interval, which relate to 191,870 unique sources. The catalogue fields cover a sky area of more
than 500 deg$^2$. The non-overlapping sky area is ~ 360 deg$^2$ (~ 1% of the sky) as many regions of the sky are observed more than once
by XMM-Newton. The catalogue probes a large sky area at the flux limit where the bulk of the objects that contribute to the X-ray
background lie and provides a major resource for generating large, well-defined X-ray selected source samples, studying the X-ray
source population and identifying rare object types. The main characteristics of the catalogue are presented, including its photometric
and astrometric properties.

Key words. catalogues - surveys - X-rays: general

Send offprint requests to: M.G. Watson

* Based on observations obtained with XMM-Newton, an ESA science
mission with instruments and contributions directly funded by
ESA Member States and NASA. Tables D.1 and D.2 are only available
in electronic form at the CDS via anonymous ftp to cdsarc.u-strasbg.fr
(130.79.128.5) or via http://cdsweb.u-strasbg.fr/cgi-bin/qcat?J/A+A/

1. Introduction

Surveys play a key role in X-ray astronomy, as they do in other
wavebands, providing the basic observational data characterising
the underlying source populations. Serendipitous X-ray sky
surveys, based on the field data from individual pointed
observations, take advantage of the relatively wide field of view
afforded by typical X-ray instrumentation. Such surveys have been
pursued with most X-ray astronomy satellites since the Einstein
Observatory. The resulting serendipitous source catalogues (e.g.,
EMSS: Gioia et al. 1990, Stocke et al. 1991; WGACAT: White
et al. 1994; ROSAT 2RXP: Voges et al. 1999; ROSAT 1RXH:
ROSAT Team 2000; ASCA AMSS: Ueda et al. 2005) have been
the basis for numerous studies and have made a significant
contribution to our knowledge of the X-ray sky and our understanding
of the nature of the various Galactic and extragalactic source
populations.

The XMM-Newton observatory provides unrivalled capabilities
for serendipitous X-ray surveys by virtue of the large field
of view of the EPIC cameras and the high throughput afforded
by the heavily nested telescope modules. This capability
guarantees that each XMM-Newton observation provides a significant
harvest of serendipitous X-ray sources in addition to data on the 
original target. In addition, the extended energy range of XMM- 
Newton (~ 0.2 - 12 keV) means that XMM-Newton detects sig- 
nificant numbers of obscured and hard-spectrum objects that are 
absent in many earlier soft X-ray surveys. 

This paper describes the Second XMM-Newton 
Serendipitous Source Catalogue (2XMM) which has been 
created from the serendipitous EPIC data from 3491
XMM-Newton pointed observations made over a ~ 7-year 
interval since launch in 1999. The XMM-Newton serendipitous 
source catalogues are produced by the XMM-Newton Survey 
Science Centre (SSC), an international consortium of ten 
European institutions, led by the University of Leicester, as 
a formal project activity performed on behalf of ESA. The 
catalogues are based on the EPIC source lists produced by 
the scientific pipeline used by the SSC for the processing
of all the XMM-Newton data. The first serendipitous source 
catalogue, 1XMM, was released in 2003 (Watson et al. 2003a; 
XMM-SSC 2003). The current 2XMM catalogue incorporates 
a wide range of improvements to the data processing, uses the 
most up-to-date instrument calibrations and includes a large 
number of new parameters. In parallel, the 2XMM catalogue 
processing also produces a number of additional data products, 
for example time-series and spectra for the brighter individual 
X-ray sources. A pre-release version of the current catalogue, 
2XMMp (XMM-SSC 2006), was made public in 2006. This 
includes ~ 65% of the fields and ~ 75% of the sky area covered 
by 2XMM, while ~ 88% of all 2XMMp sources appear in the
2XMM catalogue. Around 56% of all 2XMM sources already 
have an entry in the 2XMMp catalogue. 

The 2XMM catalogue provides an unsurpassed sky area for 
serendipitous science and reaches a flux limit corresponding to 
the dominant extragalactic source contribution to the cosmic 
X-ray background. The catalogue is part of a wider project to 
explore the source populations in the XMM-Newton serendip- 
itous survey (the XID project; Watson et al. 2001; Watson et 
al. 2003b) through optical identification of well-defined sam- 
ples of serendipitous sources (e.g., Barcons et al. 2002, 2007; 
Della Ceca et al. 2004; Caccianiga et al. 2008; Motch et al.
2002; Schwope et al. 2004; Page et al. 2007; Yuan et al. 2003; 
Dietrich et al. 2006). Indeed these identification programs were 
effectively based on less mature versions of the XMM-Newton 
catalogue data processing. XMM-Newton serendipitous survey 
results have also been used to study various statistical properties 
of the populations such as X-ray spectral characteristics, source 
counts, angular clustering, and luminosity functions (Severgnini 
et al. 2003; Mateos et al. 2005; Carrera et al. 2007; Caccianiga 
et al. 2007; Mateos et al. 2008; Della Ceca et al. 2008; Ebrero
et al. 2008). Other projects based on XMM-Newton serendipi- 
tous data include the HELLAS2XMM survey (Baldi et al. 2002; 
Cocchia et al. 2007). 

The 2XMM serendipitous catalogue described here is com- 
plementary to "planned" XMM-Newton surveys which provide 
coverage of much smaller sky areas, but often with higher sen- 
sitivity, thus exploring the fainter end of the X-ray source pop- 
ulation. The deepest such surveys, such as the Lockman Hole 
(Hasinger et al. 2001; Brunner et al. 2008) and the CDF-S 
(Streblyanska et al. 2004), cover essentially only a single XMM- 
Newton field of view, have total integration times ~ 300 -
1000 ks and reach fluxes ~ few $\times 10^{-16}$ erg cm$^{-2}$ s$^{-1}$, close to
the confusion limit. XMM-Newton has also carried out contiguous
surveys of various depths covering much larger sky
areas utilising mosaics of overlapping pointed observations to
achieve the required sensitivity and sky coverage. Currently
the largest contiguous XMM-Newton survey is the XMM-LSS
(Pierre et al. 2007) covering ~ 5 deg$^2$ with typical exposure time
10-20 ks per observation. Other medium-deep surveys of
1-2 deg$^2$ regions include the SXDS (~ 1.1 deg$^2$, 50-100 ks
exposures; Ueda et al. 2008), the COSMOS surveys (~ 2 deg$^2$,
~ 80 ks exposures; e.g., Cappelluti et al. 2007; Hasinger et al.
2007), and the Marano field survey (Krumpe et al. 2007). These
larger area surveys typically reach limiting fluxes of $10^{-14}$ to
< $10^{-15}$ erg cm$^{-2}$ s$^{-1}$.

We also note that Chandra observations have been used 
to compile a serendipitous catalogue including ~ 7000 point 
sources (the ChaMP catalogue; Kim et al. 2007) and plans are 
underway to compile a serendipitous catalogue from all suitable 
Chandra observations (Fabbiano et al. 2007). 

The paper is organised as follows. Section 2 introduces
the XMM-Newton observatory. Section 3 presents the
XMM-Newton observations used to create the catalogue and the
characteristics of the fields. Section 4 outlines the XMM-Newton data
processing framework and provides a more detailed account of
the EPIC data processing, focusing in particular on source
detection and parameterisation, astrometric corrections and flux
computation. Section 5 provides an account of the automatic
extraction of time-series and spectra for the brighter sources, while
Sect. 6 outlines the external catalogue cross-correlation
undertaken. Section 7 describes the quality evaluation undertaken and
gives some recommendations on how to extract useful sub-samples
from the catalogue. Section 8 describes additional processing
and other steps taken to compile the catalogue including the
identification of unique sources. The main properties and
characterisation of the catalogue are presented in Sect. 9. Section 10
summarises access to the catalogue and plans for future updates
to 2XMM, and Sect. 11 gives a summary.

2. XMM-Newton observatory 

To provide the essential context for this paper, the main fea- 
tures of the XMM-Newton observatory are summarised here, 
with particular emphasis on the EPIC X-ray cameras from which 
the catalogue is derived. 

The XMM-Newton observatory (Jansen et al. 2001),
launched in December 1999, carries three co-aligned
grazing-incidence X-ray telescopes, each comprising 58 nested
Wolter-I mirror shells with a focal length of 7.5 m. One of these
telescopes focuses X-rays directly on to an EPIC (European Photon
Imaging Camera) pn CCD imaging camera (Strüder et al. 2001).
The other two feed two EPIC MOS CCD imaging cameras
(Turner et al. 2001) but in these telescopes about half the X-rays
are diverted, by reflection grating arrays (RGA), to the
reflection grating spectrometers (RGS; den Herder et al. 2001) which
provide high resolution ($\lambda/\Delta\lambda \approx 100 - 800$) X-ray spectroscopy
in the 0.33-2.5 keV range. The EPIC cameras acquire data in
the 0.1 - 15 keV range with a field of view (FOV) ~ 30 arcminutes
diameter and an on-axis spatial resolution ~ 5 arcseconds
FWHM (MOS being slightly better than pn). The physical pixel
sizes for the pn and MOS cameras are equivalent to ~ 4 and
~ 1 arcseconds, respectively. The on-axis effective area for the
pn camera is approximately 1400 cm$^2$ at 1.5 keV and 600 cm$^2$
at 8 keV while corresponding MOS effective areas are about
550 cm$^2$ and 100 cm$^2$, respectively. The energy resolution for
the pn camera is ~ 120 eV at 1.5 keV and ~ 160 eV at 6 keV
(FWHM), while for the MOS camera it is ~ 90 eV and ~ 135 eV,
respectively. The EPIC cameras can be used in a variety of
different modes and with several filters (see Sect. 3.1). In addition to
the X-ray telescopes, XMM-Newton carries a co-aligned, 30 cm
diameter Optical Monitor (OM) telescope (Mason et al. 2001)
which provides an imaging capability in three broad-band
ultra-violet filters and three optical filters, spanning 1800 Å to 6000 Å;
two additional grism filters permit low dispersion ultra-violet
and optical-band spectroscopy. A separate catalogue of OM
sources is in preparation.

Fig. 1. Hammer-Aitoff equal area projection in Galactic
coordinates of the 3491 2XMM fields.

A number of specific features of XMM-Newton and the 
EPIC cameras which are referred to repeatedly in this paper are 
collected together and summarised in Appendix A together with
the relevant nomenclature. 



3. Catalogue observations 

3.1. Data selection 

XMM-Newton observations$^1$ were selected for inclusion in the
2XMM catalogue pipeline simply on the basis of their public
availability and their suitability for serendipitous science. In
practice this meant that all observations that had a public release
date prior to 2007 May 01 were eligible. A total of 3491
XMM-Newton observations (listed in Appendix B) were included in
the catalogue; their sky distribution is shown in Fig. 1. Only
a few observations (83) were omitted, typically because a valid
ODF$^2$ was not available or because of a few unresolved
processing problems. The field of view (FOV) of an XMM-Newton
observation (the three EPIC cameras combined) has a radius
~ 15 arcminutes. The XMM-Newton observations selected for
the 2XMM catalogue cover only ~ 1% of the sky (see Sect. 9.2
for a more detailed discussion). Certain sky regions have
contiguous multi-FOV spatial coverage, but the largest such region
is currently < 10 deg$^2$.
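As a rough consistency check on the coverage figures quoted in this paper, the short Python sketch below compares the ~ 360 deg$^2$ non-overlapping area with the area of the whole sky, and estimates an idealised upper bound on the summed field area from the ~ 15 arcminute FOV radius. It assumes a flat-sky circular FOV that is fully read out in every observation, which ignores CCD gaps, windowed modes and missing cameras, so the upper bound is expected to exceed the "more than 500 deg$^2$" quoted for the catalogue fields. The function and variable names are illustrative only.

```python
import math

# Area of the whole sky in square degrees: 4*pi steradians converted to deg^2.
FULL_SKY_DEG2 = 4 * math.pi * (180 / math.pi) ** 2  # about 41,253 deg^2


def fov_area_deg2(radius_arcmin: float) -> float:
    """Area of a circular field of view (flat-sky approximation)."""
    r_deg = radius_arcmin / 60.0
    return math.pi * r_deg ** 2


n_fields = 3491                        # observations in the catalogue
single_fov = fov_area_deg2(15.0)       # about 0.196 deg^2 per field
upper_bound = n_fields * single_fov    # idealised: every field fully exposed
sky_fraction = 360.0 / FULL_SKY_DEG2   # non-overlapping area from the text

print(f"single FOV area : {single_fov:.3f} deg^2")
print(f"3491 full fields: {upper_bound:.0f} deg^2")
print(f"360 deg^2       : {100 * sky_fraction:.2f}% of the sky")
```

The last line gives a little under 1%, consistent with the "~ 1% of the sky" stated above.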

By definition the catalogue observations do not form a
homogeneous set of data. The observations selected have, for
example, a wide sky distribution (see Fig. 1, where ~ 65% are at
Galactic latitude |b| > 20°), a broad range of integration times
(Fig. 2) and astrophysical content (Sect. 3.2), as well as a
mixture of EPIC observing modes and filters, as follows.

The EPIC cameras are operated in several modes of data ac- 
quisition. In full-frame and extended full-frame modes the full 
detector area is exposed, while for the EPIC pn large window 
mode only half of the detector is read out. A single CCD is used
for small window, timing and burst mode (not used for source
detection). In the case of MOS the outer ring of 6 CCDs always
remains in standard imaging mode while the central MOS CCD
can be operated separately: in partial window modes only part of
the central CCD is read out, and in fast uncompressed and
compressed modes the central CCD is in timing mode and produces
no imaging data. In the MOS refreshed frame store mode the
central CCD has a different frame time and the CCD is not used
for source detection. Table 1 lists all the EPIC camera modes of
observations incorporated in the catalogue, while Fig. 3 shows
their sky footprints.

$^1$ An observation is defined as a single science pointing at a fixed
celestial target which may consist of several exposures with the
XMM-Newton instruments.

$^2$ The Observation Data File is a collection of standard FITS format
raw data files created from the satellite telemetry.

Fig. 2. Distribution of total good exposure time (after event
filtering) for the observations included in the 2XMM catalogue (for
each observation the maximum time of all three cameras per
observation was used).

Table 1. Data modes of XMM-Newton exposures included in
the 2XMM catalogue.

Abbr.   Designation                  Description
MOS cameras:
PFW     Prime Full Window            covering full FOV
PPW2    Prime Partial W2             small central window
PPW3    Prime Partial W3             large central window
PPW4    Prime Partial W4             small central window
PPW5    Prime Partial W5             large central window
FU      Fast Uncompressed            central CCD in timing mode
RFS     Prime Partial RFS            central CCD with different frame
                                     time ('Refreshed Frame Store')
pn camera:
PFWE    Prime Full Window Extended   covering full FOV
PFW     Prime Full Window            covering full FOV
PLW     Prime Large Window           half the height of PFW/PFWE

Each XMM-Newton camera can be used with a different
filter: Thick, Medium, Thin, and Open, the choice depending
on the degree of optical blocking$^3$ desired. Table 2 gives an
overview of the data modes and filter settings used for the
2XMM observations. No Open filter exposures passed the
selection criteria (cf. Sect. 4.1), while about 20% of pn observations
are taken in timing, burst, or small window mode.



$^3$ See Appendix A.
Fig. 3. Typical sky footprints of the different observing modes
(the FOV is ~ 30'). Noticeable are the CCD gaps as well as
columns and rows excluded in the filtering process. The effects 
of vignetting and exclusion of CCDs due to much lower expo- 
sure times are not shown. Top row: MOS full window mode; 
MOS partial window W3 or W5 mode; MOS partial window W2 
or W4 mode. Bottom row: MOS fast uncompressed, fast com- 
pressed, or RFS mode; pn full window mode; pn large window 
mode. 

Table 2. Characteristics of the 3491 XMM-Newton observations
included in the 2XMM catalogue.

Camera   Modes                          Filters                  Total
         full(a)  window(b)  other(c)   thin   medium   thick
pn       2441     233        -          1233   1259     182      2674
MOS1     2560     605        219        1314   1772     298      3384
MOS2     2612     655        127        1314   1777     303      3394

(a) PFWE and PFW modes
(b) pn PLW mode and any of the various MOS PPW modes
(c) other MOS modes (FU, RFS)



3.2. Target classification and field characteristics 

The 2XMM catalogue is intended to be a catalogue of serendipi- 
tous sources. The observations from which it has been compiled, 
however, are pointed observations which typically contain one 
or more target objects chosen by the original observers, so the 
catalogue contains a small fraction of targets which are by def- 
inition not serendipitous. More generally, the fields from which 
the 2XMM catalogue is compiled may also not be representative 
of the overall X-ray sky. 

To avoid potential selection bias in the use of the cata- 
logue, an analysis to identify the target or targets of each XMM- 
Newton observation has been carried out. Additionally, an at- 
tempt has been made to classify each target or the nature of the 
field observed; this provides additional information which can 
be important in characterising their usefulness (or otherwise) 
for serendipitous science. In practice the task of identifying and 
classifying the observation target is to some extent subjective 
and likely to be incomplete (only the investigators of that ob- 
servation know all the details). Here, the main results of the ex- 
ercise are summarised. A more detailed description is given in 
Appendix C.

- Of the total 3491 observations included in 2XMM, the target 
could be unambiguously resolved in terms of its coordinates 
and classification in the vast majority of cases (~ 98%).



- In the full set of targets, ~ 50% are classified as spatially 
unresolved objects, ~ 10% as extended objects with small 
angular extent (< 3'), ~ 22% as larger extended objects, 
and around 15% can be considered to have no discrete
target, leaving only ~ 2% of unknown or problematic cases (see
Table C.1).

- Around 10% of observations were obtained for calibration 
purposes; around 3% of targets are "targets of opportunity". 

- Anticipating the discussion in Sect. 9.1, around 2/3 of the
intended targets are unambiguously identified in their XMM- 
Newton observations. 

Figure 4 illustrates the wide variety in field content (im-
ages are usually combinations of pn and MOS total-band images 
that include out-of-FOV areas). Panel (a) shows typical XMM- 
Newton observations which may be considered representative of 
most of the observations used for the catalogue. Panel (b) shows 
the variety of astrophysical content; in many of these cases the 
source detection is affected by a dominant bright point or ex- 
tended source, or by crowding in high density regions. Lastly 
panel (c) illustrates various instrumental or detector artefacts 
which, although relatively rare, cause significant source detec- 
tion issues. The most common of these, affecting ~ 6% of the 
observations each, are the OOT events and X-ray scattering off 
the RGA (see Appendix A for terminology). Both effects occur
for all sources but only become significant for the brightest ob- 
jects where they may cause spurious detections and background 
subtraction problems (as OOT events of piled-up sources are not 
represented properly in the background maps). The rarer prob- 
lems (also illustrated in panel (c)) are: 

- Pileup$^4$, which can make the centroiding of a source
difficult, resulting in off-centre detections as well as spurious
extended source detection.

- The shadows from the mirror spider can be visible in the 
PSF$^4$ wings of the very brightest sources and affect the
background maps, that is, the source parameters in these areas are
uncertain. 

- Due to the nature of the background maps (spline maps,
see Sect. 4.4.2), sharp edges, caused, for example, by noisy
CCDs, cannot be represented well and cause spurious
detections. Note that this problem can affect the parameters of
real sources as well.

- Finally, the telescope baffles allow photons from a narrow 
annular region of sky outside the nominal FOV to reach the 
detectors via a single reflection, instead of the two reflec- 
tions required for correct focusing. Bright X-ray objects in 
this annular region can give rise to bright arcs in the image, 
as shown in panel (v), which typically produce numerous 
spurious detections. 



4. Data Processing 

The SSC operates a data-processing system on behalf of ESA
for the processing of XMM-Newton pointed observations. The
system, which can be considered as a 'pipeline', uses the
XMM-Newton Science Analysis Software (SAS$^5$) to generate
high-level science products from ODFs. These science products are
made available to the principal investigator and ultimately the
astronomy community through the XMM Science Archive (XSA;
Arviset et al. 2007). In October 2006, the SSC began to
reprocess every available pointed-observation data-set from the start
of the mission. The aim was to create a uniform set of science
products using an up-to-date SAS and a constant set of
XMM-Newton calibration files$^6$ (the appropriate subset of calibration
files for any given observation was selected based on the
observation date). Of 5628 available observations, 5484 were
successfully processed. These included public as well as (at that time)
proprietary datasets (the data selection for 2XMM observations
is discussed in Sect. 3.1). The complete results of the processing
have been made available through the XSA. The new system
incorporated significant processing improvements in terms of the
quality and number of products, as described below. The
remainder of this section details those aspects of the EPIC processing
system which are pertinent to the creation of the 2XMM
catalogue.

$^4$ See Appendix A.

$^5$ The description and documentation are available on-line at the
ESAC web site http://xmm2.esac.esa.int/sas/

Fig. 4. a) Examples of typical 2XMM EPIC images (north is up). From left to right: (i) medium bright point source; (ii) deep field
observation; (iii) shallow field observation with small extended sources; (iv) distant galaxy cluster.

Fig. 4. b) Examples of variation in astrophysical content of 2XMM observations (north is up); in most of these extreme cases the
source detection is problematic. Top row, from left to right: (i) bright extended emission from a galaxy cluster; (ii) emission from a
spiral galaxy which includes point sources and extended emission; (iii) very bright extended emission from a SNR; (iv) filamentary
diffuse emission. Second row: (v) complex field near the Galactic Centre with diffuse and compact extended emission; (vi) two
medium-sized galaxy clusters; (vii) complex field of a star cluster; (viii) bright point source, off-centre.

Fig. 4. c) Examples of instrumental artefacts causing spurious source detection (north is up). From left to right: (i) bright source with
pileup and OOT events; (ii) very bright point source showing obvious pileup, shadows from the mirror spider, and scattered light
from the RGA; (iii) the PSF wings of a bright source spread beyond the unused central CCD causing a brightening of the edges on
the surrounding CCDs (which may not be well represented in the background map); (iv) obvious noisy CCDs for MOS1 (CCD#4)
and for MOS2 (CCD#5) to the top right; (v) numerous and bright single reflections from a bright point source outside the FOV, with
a star cluster to the left. See Appendix A for terminology.

The main steps in the data-processing sequence are: produc- 
tion of calibrated detector events from the ODF science frames; 
identification of the appropriate low-background time intervals 
using a threshold optimised for point-source detection; identifi- 
cation of 'useful' exposures (taking account of exposure time, 
instrument mode, etc); generation of multi-energy-band X-ray 
images and exposure maps from the calibrated events; source 
detection and parameterisation; cross-correlation of the source 
list with a variety of archival catalogues, image databases and 
other archival resources; creation of binned data products; ap- 
plication of automatic and visual screening procedures to check 
for any problems in the data products. This description and the 
schematic flowchart in Fig. 5 provide a rather simplified view of
the actual data-processing system. They, and the further detail 
that follows, are focused on those aspects that are important for 
an insight into the analysis processes that the EPIC data have 
undergone to generate the data products. A complete description 
of the data-processing system and its implementation are outside 
the scope of this paper. 

The criteria employed to select exposures for initial process- 
ing and those to be used for subsequent source detection and 
source-product generation are explained further in Sect. 4.1 but
are briefly introduced here. Several suitability tests were applied 
during processing to limit source detection and source-specific 
product creation to imaging exposures of suitable quality, mainly 
by (a) restricting the merging of exposures (and hence source de- 
tection) to imaging exposures with a minimum of good-quality 
exposure time, and (b) limiting the extraction of source-specific 
products to suitably bright sources. 

4.1. Selection of exposures 

Most XMM-Newton observations comprise a single exposure 
with each of the cameras, although a significant number of obser- 
vations are missing exposures in one or more of the three cam- 
eras for a variety of operational and observational reasons. To 
avoid generating data products of little or no scientific use, ex- 
posures for each observation were initially selected for pipeline 
processing when: 

1. the exposure duration was > 1000 seconds; 

2. the exposure was taken through a scientifically useful filter. 
In practice this requirement rejected all exposures for which 
the filter position was closed, calibration, or undefined. The 
possible filters are Medium, Thick, Thin1, Thin2 (pn only),
and Open. 
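The two initial selection criteria above can be written as a simple predicate. The following Python sketch is illustrative only: the `Exposure` record and its field names are hypothetical and are not SAS or pipeline data structures, and the filter strings are simply the names listed in the text.

```python
from dataclasses import dataclass

# Filters named in the text as possible; closed, calibration and undefined
# filter positions are rejected simply by not appearing in this set.
USEFUL_FILTERS = {"Medium", "Thick", "Thin1", "Thin2", "Open"}
MIN_DURATION_S = 1000.0  # criterion 1: exposure duration > 1000 seconds


@dataclass
class Exposure:
    """Hypothetical exposure record used only for this illustration."""
    camera: str        # "pn", "MOS1" or "MOS2"
    duration_s: float  # scheduled exposure duration in seconds
    filter_name: str   # filter wheel position


def passes_initial_selection(exp: Exposure) -> bool:
    """Apply criteria 1 and 2: long enough and a scientifically useful filter."""
    return exp.duration_s > MIN_DURATION_S and exp.filter_name in USEFUL_FILTERS


exposures = [
    Exposure("pn", 25000.0, "Thin1"),
    Exposure("MOS1", 800.0, "Medium"),    # too short
    Exposure("MOS2", 25000.0, "Closed"),  # filter not useful
]
selected = [e for e in exposures if passes_initial_selection(e)]
print([e.camera for e in selected])
```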

After event-list processing (Sect. 4.2), exposures were selected
for image creation according to the following criteria: 



[Fig. 5 flowchart: input ODF (MOS1, MOS2, pn) → initial exposure
selection → event-list processing → image creation → merging of
images for each instrument across exposures → source detection and
parameterisation → source-specific product creation (for all
suitable exposures). The legend distinguishes steps applied to
individual exposures from steps applied to merged images.]

$^6$ As available on 2006 July 02 plus three additional calibration files
for MOS2 and RGS1.



Fig. 5. A simplified schematic of the processing flow for EPIC 
image data. Early processing steps treat the data from each in- 
strument and exposure separately. Source detection and parame- 
terisation are performed simultaneously on one image from each 
energy band from each instrument. Source-specific products can 
be made, subsequently, from any suitable exposures in the ob- 
servation. Observation-level, exposure-level and source-specific 
products are screened before archiving and use in making the 
catalogue. 



3. The quality checks during the event-list processing had been 
successful. 

4. The exposure had been taken in a mode which could usefully 
be processed by the source detection stage, cf. Table 1. The
pn burst, timing, and small window modes were rejected (the 
effective FOV in the latter mode is small, i.e., 258" x 262", 
making the background fitting stage of the source detection 
problematic). For the MOS, all modes, including the outer 
CCD imaging component of modes where the central CCD 
was windowed, missing (non-imaging modes), or modified 
(Refreshed Frame Store mode), were included. 

A further set of criteria selected the appropriate images for the
detection stage (cf. Sect. 4.4), which ensured that only
high-quality images were used.

5. Background filtering (see Sect. 4.3) must have been
successfully applied. Cases where the sum of high background
GTIs$^7$ was less than 1000 seconds were rejected as unusable.
Without background filtering the source detection is usually
of limited value due to the much higher net background.

6. Each of the five images of an exposure (in the energy bands 
1-5, see Table 3) had to contain at least one pixel per image
with more than one event. This further avoided low exposure 
images being used. 



7. The image must have been in a data mode useful for source
detection (this excluded modes only used for engineering test
purposes).

8. Where more than one exposure with a particular camera
passed the above selection criteria, those exposures with the
same filter and data mode were merged and then only the
exposure group with the maximum net exposure time was
chosen for use in the source detection stage.

$^7$ See Appendix A.
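The grouping rule in criterion 8 can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline code: it assumes that the net exposure time of a merged group is the sum of the net exposure times of its members, and the exposure identifiers and tuple layout are hypothetical.

```python
from collections import defaultdict


def choose_exposure_group(exposures):
    """Pick the (filter, mode) group with the largest summed net exposure.

    `exposures` is a list of (exposure_id, filter_name, data_mode, net_s)
    tuples for ONE camera. Returns (group_key, exposure_ids, total_net_s),
    or None if the list is empty.
    """
    groups = defaultdict(list)
    for exp_id, filt, mode, net_s in exposures:
        # Exposures are merged only if both filter and data mode match.
        groups[(filt, mode)].append((exp_id, net_s))
    if not groups:
        return None
    best_key = max(groups, key=lambda k: sum(t for _, t in groups[k]))
    ids = [exp_id for exp_id, _ in groups[best_key]]
    total = sum(t for _, t in groups[best_key])
    return best_key, ids, total


# Two short Thin1 exposures together outweigh one longer Medium exposure.
mos1 = [
    ("S001", "Thin1", "PFW", 12000.0),
    ("S002", "Thin1", "PFW", 8000.0),
    ("S003", "Medium", "PFW", 15000.0),
]
result = choose_exposure_group(mos1)
print(result)
```

In this example the two Thin1 exposures are merged and selected (20 ks in total), and the single 15 ks Medium exposure is not used for source detection.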



4.2. Event-List Processing 

Event-list processing was performed on all initially selected ex- 
posures. A number of checks and corrections were applied to the 
event lists of the individual CCDs before they were merged into 
a single event list per exposure. Once merged, a further set of 
checks and corrections was performed. At each stage of the pro- 
cessing, a quality assessment of the event lists decided whether 
to continue the processing. The main steps in processing the 
event lists were as follows. 



- The CCD event lists were first examined separately on a 
frame by frame basis: corrections were applied to account 
for telemetry dropouts; gain and charge transfer inefficiency 
(CTI) corrections were made; a GTI list for each CCD was 
created; frames identified as bad and events belonging to 
them were flagged; event patterns were identified; events
were flagged if they met criteria such as being close to 
a bad pixel or edge of the CCD, which were important 
to later processing (standard #XMMEA_EM for MOS and 
#XMMEA_EP for pn); invalid events were identified and 
discarded; events caused by CCD bad pixels were identi- 
fied and removed; the fraction of the detector area in which 
events could not have been detected due to cosmic-ray events 
was recorded for each frame; events caused by CCD bad 
pixels as well as cosmic-ray events were identified and removed;
EPIC MOS CCDs operating in low-gain mode were
discarded from the event lists. 

- At the point where the event lists from individual CCDs 
were merged into exposure event lists, the event positions 
were converted from CCD pixel coordinates to the detector 
(CAMCOORD2) and sky coordinate systems. This step includes
a randomisation within each CCD pixel to eliminate Moiré
effects. The MOS camera event times were also randomised 
within the frame time, to avoid a strong Fourier peak at the 
frame period and to avoid possible beat effects with other 
instrumental frequencies. Time randomisation was not per- 
formed on pn event lists as the frame time is much shorter 
than for the MOS. 
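The two randomisation steps described above can be illustrated with a short Python sketch. It assumes that integer pixel coordinates refer to pixel centres, that positions are drawn uniformly within the pixel, and that event times are drawn uniformly within the frame in which the event was read out; the text does not state the distribution used, so these are assumptions, and the function names and the frame time in the example are illustrative values only.

```python
import random


def randomise_positions(raw_xy, rng):
    """Spread each event uniformly over its own CCD pixel.

    Integer pixel coordinates are taken to mark pixel centres, so the
    offset is drawn from [-0.5, 0.5] pixels on each axis.
    """
    return [(x + rng.uniform(-0.5, 0.5), y + rng.uniform(-0.5, 0.5))
            for x, y in raw_xy]


def randomise_times(frame_starts, frame_time_s, rng):
    """Spread each event time uniformly over its readout frame."""
    return [t + rng.uniform(0.0, frame_time_s) for t in frame_starts]


rng = random.Random(12345)          # fixed seed so the sketch is repeatable
raw_xy = [(100, 200), (100, 200), (101, 200)]
frame_starts = [0.0, 2.6, 5.2]      # illustrative frame start times (s)
frame_time_s = 2.6                  # illustrative frame time (s)

xy = randomise_positions(raw_xy, rng)
times = randomise_times(frame_starts, frame_time_s, rng)
print(xy)
print(times)
```

Without the position step, all events in a pixel would share the same coordinates and rebinning onto a sky grid of different pitch would produce regular interference (Moiré) patterns; without the time step, MOS event times would all fall on multiples of the frame period.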

In addition, the spacecraft attitude file was examined for pe- 
riods of the observation when the spacecraft pointing direction 
varied by less than 3 arcminutes from the median of the pointing 
measurements for the observation. The 3-arcminute limit was 
imposed to avoid degradation of the effective PSF, which could
arise from co-adding data with different off-axis angles and to 
avoid a potentially large (but probably low exposure) extension 
of the observed sky field. These attitude GTIs were then further 
restricted for each camera to cover only that part of the observa- 
tion when the camera was active. 
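The attitude selection described above can be sketched as follows. This is a simplified illustration rather than the pipeline implementation: the function name, the use of separate medians in RA and Dec, and the small-angle separation formula are assumptions.

```python
import math
from statistics import median

def attitude_gtis(times, ra_deg, dec_deg, limit_arcmin=3.0):
    """Return [(start, stop), ...] for runs of attitude samples whose pointing
    lies within limit_arcmin of the median pointing of the observation."""
    ra0, dec0 = median(ra_deg), median(dec_deg)
    cos_dec0 = math.cos(math.radians(dec0))
    gtis, start, stop = [], None, None
    for t, ra, dec in zip(times, ra_deg, dec_deg):
        # Small-angle separation in arcminutes (assumes no RA wrap at 0/360 deg).
        sep = 60.0 * math.hypot((ra - ra0) * cos_dec0, dec - dec0)
        if sep < limit_arcmin:
            if start is None:
                start = t
            stop = t
        elif start is not None:
            gtis.append((start, stop))
            start = None
    if start is not None:
        gtis.append((start, stop))
    return gtis

# One sample (index 3) is 0.2 deg = 12 arcmin away and splits the interval.
example = attitude_gtis([0, 1, 2, 3, 4, 5],
                        [10.0, 10.0, 10.0, 10.2, 10.0, 10.0],
                        [0.0] * 6)
```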



4.3. Creation of multi-band images and exposure maps 

Periods of high background (mostly due to so-called 'soft pro- 
ton' flares) can significantly reduce the sensitivity of source de- 
tection. Since events caused by such flares are usually much 
harder than events arising from typical X-ray sources, back- 
ground variation can be disentangled from possible time vari- 
ation of the sources in the field by monitoring events at energies 
higher than the 12 keV upper boundary to the 'science band', beyond
which point contributions from cosmic X-ray sources are
very rare. A time series of such events, including most of the 
FOV, was constructed for each exposure. This event rate was 
used as a proxy for the science-band background rate. 

The generation of the background time-series differed in de- 
tail between pn and MOS cameras, in particular in terms of the 
events used to form the time-series. The MOS high-energy back- 
ground time-series were produced from single-pixel events with 
energies above 14 keV from the imaging CCDs. The background 
GTIs were taken to be those time intervals of more than 100 s in 
duration with a count rate of less than 2 ct ks$^{-1}$ arcmin$^{-2}$. The pn
high-energy background time-series were produced in the 7.0-15 keV
energy range. The background GTIs were taken to be
those time intervals of more than 100 s in duration with a count
rate of less than 10 ct ks$^{-1}$ arcmin$^{-2}$.
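The construction of background GTIs from a high-energy rate curve can be sketched as below. The function name and the assumption of contiguous, equal-width time bins are illustrative choices; only the thresholds (2 and 10 ct ks$^{-1}$ arcmin$^{-2}$) and the 100 s minimum duration come from the text.

```python
def background_gtis(bin_starts, bin_width, rates, max_rate, min_duration=100.0):
    """Intervals in which the rate stays below max_rate for more than
    min_duration seconds. Assumes contiguous bins of equal width."""
    gtis, start = [], None
    for t, rate in zip(bin_starts, rates):
        if rate < max_rate:
            if start is None:
                start = t          # a quiet interval begins
        elif start is not None:
            if t - start > min_duration:
                gtis.append((start, t))
            start = None           # a flare ends the quiet interval
    if start is not None:
        end = bin_starts[-1] + bin_width
        if end - start > min_duration:
            gtis.append((start, end))
    return gtis

starts = [50.0 * i for i in range(10)]           # ten 50 s bins
mos_rates = [1, 1, 1, 5, 1, 1, 1, 1, 1, 5]       # ct/ks/arcmin^2, illustrative
mos_gtis = background_gtis(starts, 50.0, mos_rates, max_rate=2.0)
```

In this example the two flaring bins split the exposure into quiet intervals of 150 s and 250 s, both of which pass the duration cut.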

These threshold count rates were chosen as a good compro- 
mise between reducing background and preserving exposure for 
detecting point sources in the relatively short exposures which 
make up the bulk of the XMM-Newton observations. For comparison,
the average quiet level in the MOS cameras, for example,
is ~ 0.5 ct ks$^{-1}$ arcmin$^{-2}$.

For all exposures in imaging mode, images were created for 
energy bands 1-5, as listed in Table 3, from selected events filtered
by event-list, attitude, and high background GTIs (except
where the sum of all high background GTIs was less than 1000
seconds, in which case no background filtering was applied).
Note that the event-list GTIs are CCD dependent and the result- 
ing image can have a different exposure time in each CCD. The 
events for pn images were selected by pattern ≤ 4 (for band 1
a stricter requirement of pattern = 0 was adopted) and a cut in
CCD coordinates (Y > 12) to reduce bright low-energy edges. 
Events on CCD columns suffering a particularly large energy 
scale offset as well as events outside the FOV were excluded. 
For MOS images, events with pattern ≤ 12 were selected and
events outside the FOV were excluded. The images are tangent- 
plane projections of celestial coordinates and have dimensions 
of 648 x 648 image pixels, with a pixel size of 4" x 4". 



Table 3. Energy bands used in 2XMM processing

Band number   Energy band (keV)   Notes
1             0.2 - 0.5
2             0.5 - 1.0
3             1.0 - 2.0
4             2.0 - 4.5
5             4.5 - 12.0
6             0.2 - 2.0           'soft band'
7             2.0 - 12.0          'hard band'
8             0.2 - 12.0          'total band'
9             0.5 - 4.5           'XID band'

9 See Appendix A.



Exposure maps represent the GTI-filtered on-time multi- 
plied by the (spatially dependent) vignetting function, adjusted 
to reflect telescope and instrumental throughput efficiency. They
were created for each EPIC exposure in imaging mode in en- 
ergy bands 1-5 using the calibration information on mirror vi- 
gnetting, detector quantum efficiency, and filter transmission. 
The exposure maps were corrected for bad pixels, bad columns 
and CCD gaps (cf. Fig. 3) as well as being multiplied by an OOT
factor which is 0.9411 for pn full frame modes, 0.97815 for pn
extended full frame modes, and 1.0 for all other pn and MOS 
modes. 
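As a rough illustration of how these ingredients combine, the sketch below builds an exposure map from an on-time, a vignetting map and a good-pixel mask. It is a simplification under stated assumptions: the function and mode names are hypothetical, and the detector quantum efficiency and filter transmission mentioned above are omitted.

```python
# OOT factors quoted in the text; any other mode uses 1.0.
OOT_FACTOR = {"pn_full_frame": 0.9411, "pn_extended_full_frame": 0.97815}

def exposure_map(on_time, vignetting, good_pixel, mode):
    """Per-pixel exposure: on-time x vignetting x OOT factor, zero on bad pixels.

    vignetting and good_pixel are lists of rows with the same shape.
    """
    factor = OOT_FACTOR.get(mode, 1.0)
    return [[on_time * v * factor if ok else 0.0
             for v, ok in zip(v_row, ok_row)]
            for v_row, ok_row in zip(vignetting, good_pixel)]

# A 1 x 2 pixel 'image': one good on-axis pixel and one bad pixel.
emap = exposure_map(1000.0, [[1.0, 0.5]], [[True, False]], "pn_full_frame")
```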



4.4. Source detection & parameterisation

The fundamental inputs to the 2XMM catalogue are the mea- 
sured source parameters which were extracted from the EPIC 
image data by the multi-step source detection procedure outlined 
below. Each step was carried out simultaneously on each image 
of the five individual bands, 1-5, and of the three cameras. Note 
that the source counts and rates derived here refer to the fully in- 
tegrated PSF. 

As a first step, a detection mask was made for each cam- 
era. This defines the area of the detector which is suitable for 
source detection. Only those CCDs where the unvignetted ex- 
posure map values were at least 50% of the maximum exposure 
map value were used for source detection. 

4.4.1. Sliding-box source detection - local mode

An initial source list was made using a 'box detection' algo- 
rithm. This slides a search box (20" x 20") across the image 
defined by the detection mask. The size of the box comprises 
~ 50% of the encircled energy fraction of the on-axis PSF.$^{10}$ In
its first application ('local mode') the algorithm derived a local 
background from a frame (8" wide) immediately surrounding 
the search box. In each of the five bands from each of the three 
cameras, the probability, $P_\Gamma(k, x)$, and corresponding likelihood,
$L$, were computed from the null hypothesis that the measured
counts $k$ or more in the search box result from a Poissonian
fluctuation in the estimated background level, $x$, i.e.:

$$L = -\ln P_\Gamma(k, x),$$

where $P_\Gamma$ is the incomplete Gamma function:

$$P_\Gamma(k, x) = \frac{1}{\Gamma(k)} \int_0^x e^{-t}\, t^{k-1}\, dt,$$

and

$$\Gamma(k) = \int_0^\infty e^{-t}\, t^{k-1}\, dt.$$

The sum of $N$ independent likelihoods, after multiplication by 2,
is expected to have, in the limit of large $N$, the same probability
distribution as $\chi^2$ for $N$ degrees of freedom (Cash 1979). For this
reason the total-band EPIC box-detect likelihood was calculated
by summing the band-specific likelihoods in this way and inserting
the result in the standard formula for the probability for $\chi^2$ to
equal or exceed the measured value in the null hypothesis, i.e.,

$$L \approx -\ln\left(1 - P_\Gamma(N, L')\right) \quad {\rm with} \quad L' = \sum_{i=1}^{N} L_i , \qquad (1)$$

where $N$ is the number of energy bands and cameras involved.
All sources with a total-band EPIC likelihood above 5
were included in the output list.
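The box-detection likelihoods can be evaluated with a few lines of code. The sketch below is an illustration, not the detection software itself; the function names are hypothetical. It uses the fact that, for integer $k$, the incomplete Gamma function $P_\Gamma(k, x)$ equals the probability of observing $k$ or more counts from a Poisson distribution of mean $x$.

```python
import math

def poisson_tail(k, x):
    """P_Gamma(k, x) for integer k >= 0: probability of k or more counts
    from a Poisson background with mean x."""
    term, cdf = math.exp(-x), 0.0
    for j in range(k):
        cdf += term              # adds e^-x x^j / j!
        term *= x / (j + 1)
    return 1.0 - cdf

def box_likelihood(k, x):
    """L = -ln P_Gamma(k, x) for one band and one camera."""
    return -math.log(poisson_tail(k, x))

def combined_likelihood(likelihoods):
    """Eq. (1): L = -ln(1 - P_Gamma(N, L')), with L' the sum of N likelihoods."""
    n, total = len(likelihoods), sum(likelihoods)
    return -math.log(1.0 - poisson_tail(n, total))

# Illustrative counts and background levels for three images.
per_image = [box_likelihood(k, x) for k, x in [(10, 3.0), (7, 2.5), (4, 3.5)]]
total_band = combined_likelihood(per_image)
detected = total_band > 5.0      # threshold quoted in the text
```

Note that this direct summation loses precision when the tail probability is far below machine precision; a production implementation would evaluate the logarithm of the tail more carefully.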



4.4.2. Sliding-box source detection - map mode 

After the first pass to detect sources, a background map was cre- 
ated for each camera and energy band. Using a cut-out radius 
dependent on source brightness in each band (specifically the 
radius where the source counts per unit area fell below 0.002 
ct arcsec$^{-2}$), areas of the image where sources had been detected
were blanked out. A 12 x 12-node spline surface was fitted to the 
resulting source-free image to calculate a smoothed background 
map for the entire image. For the pn images the contribution of 
OOT events was also modelled into the background maps. 

A second box-source-detection pass was carried out, creating 
a new source list, this time using the spline background maps 
('map mode') which increased the source detection sensitivity 
compared to the local-mode detection step. The box size was 
again set to 20" x 20". Source counts were corrected for the part 
of the PSF falling outside the detection box. Only sources with 
a total-band EPIC likelihood, cf. Eq. (1), above 5 were included
in this map-mode source list. 



4.4.3. Source parameter estimation by maximum likelihood 
fitting 

A maximum likelihood fitting procedure was then applied to the 
sources emerging from the map-mode detection stage to calcu- 
late source parameters in each input image. This was accom- 
plished by fitting a model to the distribution of counts over a 
circular area of radius 60". The energy-dependent model value,
$e_i$, in pixel $i$ is given by

$$e_i = b_i + \alpha S_i \qquad (2)$$

where $b_i$ is the background, derived from the background map,
$S_i$ is the source profile (i.e. the PSF, convolved with the source
extent model; Sect. 4.4.4) and $\alpha$ is a scalar multiplier of the
source profile.

For each source, the fitting procedure minimised the C-statistic
(Cash 1979)

$$C = 2 \sum_{i=1}^{N} (e_i - n_i \ln e_i)$$

to find the best set of model parameters, where $e_i$ is the expected
model value in pixel $i$ (eqn. (2)), $n_i$ the measured number of
counts in pixel $i$, and $N$ is the total number of pixels over all
images used.

10 The encircled energy fraction does not strongly depend on off-axis
angle.

Free parameters of the fit were source position, extent, and 
source count rate. Positions and extent were constrained to be 
the same in all energy bands and for all cameras while the count 
rates were fitted separately for each camera and energy band. 
The fitting process used the multi-band exposure maps to take 
account of various instrumental effects (cf. Sect. 4.3) in deriving
the source counts $c_s$:

$$c_s(x, y) = R_s(x, y)\, t_{\rm map}(x, y) ,$$

where $R_s(x, y)$ is the source count rate in each image pixel as
predicted by the instrumental PSF and source extent model and
$t_{\rm map}(x, y)$ is the corresponding value of the exposure map.

After arriving at those values of the source parameters which
minimise $C$, the detection likelihood (formally, the probability
of the null hypothesis) for those optimum values is then calculated.
Cash's prescription for this is to form the difference

$$\Delta C = C_{\rm null} - C_{\rm best}$$

where $C_{\rm null}$ is the C-statistic of the null hypothesis model (i.e.,
with zero source flux) and $C_{\rm best}$ is the minimum result returned
by the fitting routine. According to Cash, $\Delta C$ is distributed
approximately as $\chi^2$ for $\nu$ degrees of freedom, where $\nu$ is the
number of fitted parameters. The probability $P(\chi^2 \geq \Delta C)$ of
obtaining the calculated value of $\Delta C$ or greater by chance fluctuations
of the detected background can therefore be obtained in terms of
the incomplete Gamma function $P_\Gamma$ as follows:

$$P(\chi^2 \geq \Delta C) = 1 - P_\Gamma\left(\frac{\nu}{2}, \frac{\Delta C}{2}\right).$$

Note that the values $L$ which are stored in the source lists are
log-likelihoods, formed from $L = -\ln(P)$.$^{11}$

Since the $C$ values are simple sums over all image pixels
included in the fit, one may calculate $\Delta C_i$ for each band $i$ then
add the results together to generate a total-band $\Delta C_{\rm total}$ without
destroying the $\chi^2$ equivalence: only the number of degrees of
freedom changes. The source detection procedure thus calculates
$\Delta C_i$, and hence $L_i$, for each $i$th band, using $\nu = 3$ (= 4 if source
extent is also fitted), then sums the $\Delta C_i$ and calculates $L_{\rm total}$ using
$\nu = N + 2$ (= $N + 3$), where $N$ is the number of bands.
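The conversion from $\Delta C$ to a detection likelihood can be sketched as follows. This is an illustrative implementation, not the fitting software: the function names are hypothetical, and the upper incomplete Gamma function is evaluated with a standard recurrence that is valid when $\nu$ is a positive integer (so that $\nu/2$ is an integer or half-integer).

```python
import math

def gamma_q(a, x):
    """Q(a, x) = 1 - P_Gamma(a, x) for a a positive multiple of 1/2."""
    if (2 * a) % 2 == 0:
        q, a0 = math.exp(-x), 1.0                  # Q(1, x) = e^-x
    else:
        q, a0 = math.erfc(math.sqrt(x)), 0.5       # Q(1/2, x) = erfc(sqrt(x))
    while a0 < a:
        # Recurrence: Q(a0 + 1, x) = Q(a0, x) + x^a0 e^-x / Gamma(a0 + 1)
        q += x ** a0 * math.exp(-x) / math.gamma(a0 + 1.0)
        a0 += 1.0
    return q

def detection_likelihood(delta_c, nu):
    """L = -ln P(chi^2 >= Delta C) for nu degrees of freedom."""
    return -math.log(gamma_q(nu / 2.0, delta_c / 2.0))

# Illustrative Delta C values for N = 5 bands of a point source.
band_delta_c = [4.0, 9.5, 12.0, 6.5, 1.0]
band_l = [detection_likelihood(dc, 3) for dc in band_delta_c]
total_l = detection_likelihood(sum(band_delta_c), len(band_delta_c) + 2)
```

The total-band value uses the summed $\Delta C$ with $\nu = N + 2$, because position is shared between bands while each band has its own count rate.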

The fitting of the input sources was performed in the order of 
descending box(map)-detect detection likelihood. After each fit 
the resulting source model was added to an internally maintained 
background map used for the fitting of subsequent sources. With 
this method the background caused by the PSF wings of brighter 
sources is taken into account when fitting the fainter sources. 
All sources (as detected by the sliding-box in map mode) with a 
total-band detection likelihood > 6, as determined by the fitting 
process, were included in the output source list. Note that for 
individual cameras and energy bands, the fitted likelihood values 
can be as low as zero. 

The calculation of the parameter errors made use of the fact
that $\Delta C$ follows the $\chi^2$-distribution. The 68% confidence intervals
were determined by fixing the model to the best-fit parameters
and then subsequently stepping one parameter at a time in
both directions until $C = C_{\rm best} + 1$ is reached (while the other free
parameters were kept fixed). The upper and lower bound errors
were then averaged to define a symmetric error. Note that using
$C_{\rm best} + 1$ to determine the 68% confidence intervals is only strictly
correct in the case that there is one parameter of interest. In the
case of the fitting performed here, this requires that the position
and amplitude parameters are essentially independent (i.e. that
the cross-correlation terms of the error matrix are negligible).
This has been found through simulations to be an acceptable
approximation in the present case (see also the discussion of the
astrometric corrections in Sect. 4.5).
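The stepping procedure can be demonstrated on a one-parameter toy problem in which only the amplitude $\alpha$ of Eq. (2) is free. The data, the ternary-search minimiser and the step size below are assumptions chosen for illustration; they are not the pipeline's algorithm or settings.

```python
import math

def cstat(alpha, counts, background, profile):
    """C = 2 * sum(e_i - n_i ln e_i), with e_i = b_i + alpha * S_i."""
    return 2.0 * sum((b + alpha * s) - n * math.log(b + alpha * s)
                     for n, b, s in zip(counts, background, profile))

def fit_alpha(counts, background, profile, lo=0.0, hi=1000.0):
    """Minimise C over alpha by ternary search (C is convex in alpha)."""
    for _ in range(200):
        m1, m2 = lo + (hi - lo) / 3.0, hi - (hi - lo) / 3.0
        if cstat(m1, counts, background, profile) < cstat(m2, counts, background, profile):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

def bound(alpha_best, c_best, direction, step, counts, background, profile):
    """Step alpha away from the best fit until C first reaches C_best + 1."""
    alpha = alpha_best
    while cstat(alpha, counts, background, profile) < c_best + 1.0:
        alpha += direction * step
    return alpha

counts = [6, 11, 21, 11, 6]                 # toy data: n_i = b_i + 50 * S_i
background = [1.0] * 5
profile = [0.1, 0.2, 0.4, 0.2, 0.1]
alpha_best = fit_alpha(counts, background, profile)
c_best = cstat(alpha_best, counts, background, profile)
upper = bound(alpha_best, c_best, +1, 0.01, counts, background, profile)
lower = bound(alpha_best, c_best, -1, 0.01, counts, background, profile)
alpha_err = 0.5 * (upper - lower)           # averaged, symmetric 68% error
```

The upper and lower excursions differ slightly because the C-statistic is not symmetric about its minimum; averaging them gives the single symmetric error described in the text.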

11 Protassov et al. (2002) have highlighted the dangers of using the
probabilities derived from likelihood ratio tests when the null hypothesis
is close to the boundary of parameter space. In this regard it is clear
that it is inappropriate to interpret the detection likelihoods, $L$, literally
in terms of detection probabilities. Instead the relation between the
likelihood and the detection probability requires calibration via simulations,
as is discussed in Sect. 9.4.

Four camera-specific X-ray colours, known as hardness ratios
(HR1-HR4), were obtained for each camera by combining
corrected count rates from energy bands $n$ and $n + 1$:

$${\rm HR}n = (R_{n+1} - R_n) / (R_{n+1} + R_n)$$

where $R_n$ and $R_{n+1}$ are the corrected count rates in energy bands
$n$ and $n + 1$ ($n$ = 1-4). Count rates, and therefore hardness ratios,
are camera dependent. In addition, they depend on the filter
used for the observation, especially for HR1. Note that HR1 is
also a strong function of Galactic absorption, $N_{\rm H}$. This needs to
be taken into account when comparing hardness ratios for different
sources and cameras. It should be stressed that a large
fraction of the hardness ratios were calculated from marginal or
non-detections in at least one of the energy bands. Consequently,
individual hardness ratios should only be deemed reliable if the
source is detected in both energy bands, otherwise they have to
be treated as upper or lower limits. Similarly, the errors on the
hardness ratios will be affected for band-limited count rates in
the Poisson regime (Park et al. 2006).
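The hardness ratio definition translates directly into code. The short sketch below is illustrative; the function name and the choice to return `None` when both rates are zero are assumptions.

```python
def hardness_ratios(rates):
    """HR1-HR4 from the corrected count rates in bands 1-5 of one camera."""
    ratios = []
    for r_n, r_n1 in zip(rates[:-1], rates[1:]):
        total = r_n1 + r_n
        # HRn = (R_{n+1} - R_n) / (R_{n+1} + R_n); undefined if both rates vanish.
        ratios.append((r_n1 - r_n) / total if total > 0 else None)
    return ratios

hr = hardness_ratios([1.0, 3.0, 3.0, 1.0, 0.0])   # illustrative rates in ct/s
```

A value of +1 or -1 signals that one of the two bands contributed no counts, which is the situation in which the ratio should be read as a limit rather than a measurement.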

4.4.4. Extended-source parameterisation 

One of the enhancements incorporated in the 2XMM processing 
that was not available in 1XMM was information about the po- 
tential spatial extent of sources and, where detected, a measure 
of that extent. 

The source extent characterisation was realised by fitting a
convolution of the instrumental PSF and an extent model to each
input source. The extent model was a $\beta$-model of the form

$$f(x, y) = \left(1 + \frac{(x - x_0)^2 + (y - y_0)^2}{r_c^2}\right)^{-3\beta + 1/2}$$

where $\beta$ was fixed at the canonical value $\beta = 2/3$ for the surface
brightness distribution of clusters of galaxies (Jones & Forman
1984; but see Sect. 9.9 for a discussion of problems arising from
this assumption). The core radius, $r_c$, the 'extent' parameter of
a source, was fitted with a constraint that $r_c$ < 80". Cases with
$r_c$ < 6" were considered to be consistent with a point source and
$r_c$ was reset to zero.
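For reference, the $\beta$-model and the constraints on the core radius can be written as the following sketch. The function names are hypothetical, and capping the core radius at exactly 80" is a simplification of the fit constraint $r_c$ < 80".

```python
def beta_model(x, y, x0, y0, r_c, beta=2.0 / 3.0):
    """Surface brightness of the beta-model, normalised to 1 at the centre."""
    r2 = (x - x0) ** 2 + (y - y0) ** 2
    return (1.0 + r2 / r_c ** 2) ** (-3.0 * beta + 0.5)

def constrain_extent(r_c, r_min=6.0, r_max=80.0):
    """Apply the extent limits: below 6 arcsec is treated as point-like."""
    if r_c < r_min:
        return 0.0
    return min(r_c, r_max)

# With beta = 2/3 the exponent is -3/2, so the profile falls to 2^(-3/2)
# of its central value at one core radius.
at_core_radius = beta_model(10.0, 0.0, 0.0, 0.0, r_c=10.0)
```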

An extent likelihood based on the C-statistic and the best-fit
point source model as null hypothesis was calculated in an
analogous way to that used in the detection likelihood described
in Sect. 4.4.3. The extent likelihood $L_{\rm ext}$ is related to the
probability $P$ that the detected source is spuriously extended due to
Poissonian fluctuation (i.e., the source is point-like) by

$$L_{\rm ext} = -\ln(P).$$

A source was classified as extended if $r_c$ > 6" and if the extended
model improved the likelihood with respect to the point
source fit such that it exceeded a threshold of $L_{\rm ext,min} = 4$.

Since source extent can be spuriously detected by the confu- 
sion of two or more point sources, a second fitting stage tested 
whether a model including a second source further improved the 
fit. If the second stage found an improvement over the single- 
source fit, the result could be two point sources or a combination 
of one point source and one extended source. Note, however, 
that the previously fitted fainter sources (Sect. 4.4.3) are not
recomputed in such cases.

4.5. Astrometric corrections 

The positions of X-ray sources were determined during the max- 
imum likelihood fitting of the source. These positions were 
placed into an astrometric frame determined from the XMM- 
Newton on-board Attitude & Orbit Control Subsystem (AOCS) 
which uses XMM-Newton's two star trackers and its "fine sun 
sensors". The overall accuracy of the XMM-Newton astromet- 
ric frame (i.e., the difference between the XMM-Newton frame 
and the celestial reference frame) is typically a few arcseconds 
although a few observations suffer rather poorer accuracy. 

As the mean positions of bright X-ray sources can be determined
to a statistical precision of ≪ 1" in the XMM-Newton
images, and typical sources to a precision of 1" - 2", it is clearly
worthwhile to improve the astrometric precision of the posi- 
tions. This was done on an observation by observation basis by 
cross-correlating the source list with the USNO B1.0 catalogue 
(Monet et al. 2003). This approach depends on the assumptions, 
usually valid, that a significant number of XMM-Newton detec- 
tions will have an optical counterpart in the USNO catalogue 
and that the number of random (false) matches is low. The algo- 
rithm used a grid of trial position offsets (in RA and Dec) and 
rotations between the XMM-Newton frame and the true celes- 
tial frame (as defined by the USNO objects) and determined the 
optimum combination of offset and rotation values which max- 
imised a likelihood statistic related to the X-ray/optical object 
separations. 

To determine whether the offset/rotation parameters so deter- 
mined represented an acceptable solution, an empirically deter- 
mined condition was used. This was based on a comparison of 
the likelihood statistic determined from the analysis with that 
calculated for purely coincidental X-ray/optical matches in a 
given observation, i.e., if there were no true counterparts. 

In practice this approach worked very well at high Galactic 
latitudes, resulting in a high success rate (74% of fields with 
|b| > 20°), whilst at low Galactic latitudes (and other regions
of high object density) the success rate was much lower (33%
of fields with |b| < 20°). The typical derived RA, Dec offsets
were a few arcseconds, and a few tenths of a degree in field rota- 
tion, values consistent with the expected accuracy of the nominal 
XMM-Newton astrometric frame as noted above. 

The 2XMM catalogue contains equatorial RA and Dec coor- 
dinates with the above determined astrometric corrections ap- 
plied and corresponding coordinates which are not corrected. 
Where the refined astrometric solution was not accepted, the cor- 
rected and uncorrected coordinates are identical. 

The catalogue also reports the estimated residual component
of the position errors, $\sigma_{\rm sys}$.$^{12}$ This has the value 0.35" for all
detections in a field for which an acceptable astrometric correction
was found and 1.0" otherwise. The values of $\sigma_{\rm sys}$ in the catalogue
are a new determination of the residual error component based
on further analysis undertaken after the initial compilation of the
catalogue was completed. The details of this analysis are given
in Sect. 9.5. Higher initial values of $\sigma_{\rm sys}$ (0.5" and 1.5",
respectively) were used in earlier stages of the catalogue creation, for
example in the external catalogue cross-correlation (see Sect. 6).



4.6. Flux computation 

The fluxes, $F_i$, given in the 2XMM catalogue have been obtained
for each energy band, $i$, as

$$F_i = R_i / f_i$$

where $R_i$ is the corrected source count rate and $f_i$ is the energy
conversion factor (ECF) in units of $10^{11}$ ct cm$^{2}$ erg$^{-1}$. The ECFs
depend on camera, filter, data mode, and source spectrum. Since
the dependence on data mode is low (1-2%), ECF values were
calculated only for the full window mode, which is the most
frequently used (cf. Table 2). To compute the ECF values, a broad-band
source spectrum was assumed, characterised by a power
law spectral model with photon index $\Gamma = 1.7$ and observed
X-ray absorption $N_{\rm H} = 3 \times 10^{20}$ cm$^{-2}$. As shown in Sect. 9.7 (cf.
Fig. 12), this model provides a reasonable representation of the
emission of the bulk of the sources in 2XMM. A single model
cannot, of course, provide the correct flux conversion for different
intrinsic spectra, and the effect of varying the shape of the
assumed power-law spectrum on the measured fluxes has been
investigated. For example, for $\Delta\Gamma = \pm 0.3$ the fluxes can change
by ~ 6% and ~ 8% in bands 1 and 5, respectively. The effect
is much less (< 2%) for bands 2-4 (i.e., between 0.5 keV and
4.5 keV). Very soft or very hard spectra will, of course, produce
much greater changes in the conversion factor, particularly in the
softest and hardest energy bands.

Table 4. Energy conversion factors used to compute 2XMM catalogue
fluxes (in units of $10^{11}$ ct cm$^{2}$ erg$^{-1}$).

Camera   Band   Thin       Medium     Thick
pn       1      8.95403    7.82028    4.71096
         2      8.09027    7.83782    6.02015
         3      5.88255    5.78272    5.00419
         4      1.92805    1.90529    1.80647
         5      0.555226   0.554529   0.547205
         9      4.53836    4.43953    3.74772
MOS1     1      1.80399    1.60150    1.06500
         2      1.88017    1.82853    1.48465
         3      2.05034    2.01594    1.79446
         4      0.746128   0.737800   0.707822
         5      0.143340   0.143131   0.141213
         9      1.42040    1.39361    1.23264
MOS2     1      1.81179    1.60670    1.06620
         2      1.88369    1.83088    1.48818
         3      2.05117    2.01594    1.79530
         4      0.750569   0.741687   0.711708
         5      0.150769   0.150560   0.148537
         9      1.42326    1.39647    1.23524

Note that the fluxes given in 2XMM have not been corrected 
for Galactic absorption along the line of sight. The ECF values 
used in the 2XMM catalogue are shown in Table 4.
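As a worked example of the flux conversion, the sketch below applies $F_i = R_i / f_i$ using a few ECF values copied from Table 4. The dictionary layout, the function name and the example count rate are illustrative assumptions.

```python
# (camera, band, filter) -> ECF in units of 1e11 ct cm^2 / erg (from Table 4).
ECF = {
    ("pn", 1, "Thin"): 8.95403,
    ("pn", 2, "Medium"): 7.83782,
    ("MOS1", 3, "Thick"): 1.79446,
}

def band_flux(rate, camera, band, filt):
    """Flux in erg cm^-2 s^-1 for a corrected count rate in ct/s."""
    return rate / (ECF[(camera, band, filt)] * 1.0e11)

# A pn band-1 rate of 0.1 ct/s through the Thin filter gives ~1.1e-13 erg/cm^2/s.
flux = band_flux(0.1, "pn", 1, "Thin")
```

Because the ECFs assume the fixed absorbed power-law spectrum described above, the resulting fluxes inherit that assumption and are not corrected for Galactic absorption.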

Publicly available response matrices (RMFs) were used in
the computation of the ECFs.$^{13}$ For the pn they were on-axis
matrices for single-only events for band 1 and for
single-plus-double events$^{14}$ for bands 2-5 (epn_ff20_sY9_v6.7.rmf,
epn_ff20_sdY9_v6.7.rmf, respectively). For the MOS cameras
there has been a significant change in the low energy redistribution
characteristics with time, especially for sources close to
the optical axis. In addition, during XMM-Newton revolution
534 the temperatures of both MOS focal plane CCDs were reduced
(from -100 °C to -120 °C), resulting in an improved spectral
response thereafter (mainly in the energy resolution). To
account for these effects, epoch-dependent RMFs were produced.
However, in the computation of MOS ECFs time-averaged
RMFs were used (for revolution 534). To be consistent with
the event selection used to create MOS X-ray images, the standard
MOS1 and MOS2 on-axis RMFs for patterns 0-12 were
used (m1_534_im_pall.rmf, m2_534_im_pall.rmf).

Note that for the computation of the ECFs, the effective areas
used in the spectral fitting were calculated without the corrections
already applied to the source count rates (i.e., instrumental
effects including vignetting and bad-pixel corrections,
see Sect. 4.3) as well as for the PSF enclosed-energy fraction.



12 In the catalogue and associated documentation we refer to this as 
a 'systematic' error. This nomenclature is somewhat misleading as the 
true nature of this component of the positional errors is far from clear. 



13 EPIC RMFs are available at

http://xmm.vilspa.esa.es/external/xmm_sw_cal/calib/epic_files.shtml 

14 Single-only events = pattern 0, single-plus-double events = patterns 
1-4. 
5. EPIC source-specific product generation

The 2XMM processing pipeline was configured to automatically
extract source-specific products, i.e., individual time-series
(including variability measures) and spectra for the brighter
detections. Sources were selected when the following extraction
criteria were satisfied: 1) they had > 500 total-band EPIC counts,$^{15}$
2) the detector coverage of the source, weighted by the PSF for
the respective camera, was > 0.5, and 3) the total-band detection
likelihood for the respective camera was > 15. The decision
whether to extract products for a source was based solely on it
meeting these extraction criteria in the (merged) exposures used
in source detection (Sect. 4.4). However, products for qualifying
sources were subsequently extracted for all exposures (i.e.,
imaging event lists) of an observation that adhered to the general
exposure selection criteria given in Sect. 4.1 (i.e., items 1-7).

Table 5 shows the event selection criteria for the extraction
of the source products. Instrumental GTIs (stored in the event
list) are always applied, while GTIs for masking out high
background flaring (see Sect. 4.3) were only applied to spectra and
the variability tests. Source data were extracted from a circular
region of radius r = 28", centred on the detected source position,
while the background extraction region was a co-centred annulus
with 60" < r < 180". Circular apertures of radius r = 60"
were masked from the background region for any contaminating
detection with a likelihood > 15 for that camera. These values
represent a compromise choice for data extraction by avoiding
the additional complexity required to implement a variable
extraction radius optimised for each source. Note that the use of
an aperture-photometry background subtraction procedure here
differs from the use of the background maps applied at the
detection stage.
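The geometry of the extraction regions can be sketched as a simple classifier for positions on the sky plane. The function name, the flat-sky coordinates in arcseconds and the treatment of region boundaries are assumptions made for illustration.

```python
import math

def region_of(x, y, source, contaminants=(), r_src=28.0,
              r_in=60.0, r_out=180.0, r_mask=60.0):
    """Classify a position (arcsec, tangent plane) relative to one source.

    Returns 'source', 'background', 'masked' or 'unused'.
    """
    r = math.hypot(x - source[0], y - source[1])
    if r <= r_src:
        return "source"
    if r_in < r < r_out:
        # Background annulus, minus circles around contaminating detections.
        if any(math.hypot(x - cx, y - cy) <= r_mask for cx, cy in contaminants):
            return "masked"
        return "background"
    return "unused"

src = (0.0, 0.0)
neighbour = [(120.0, 0.0)]   # hypothetical contaminating detection
```

Positions between 28" and 60" from the source belong to neither region, which keeps the PSF wings of the source out of the background estimate.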



5.1. Spectra 

For each source meeting the extraction criteria, the pipeline 
created the following spectrum-related products: 1) a 
source+background spectrum (grouped to 20 ct/spectral- 
bin) and a corresponding background-subtracted XSPEC 
(Dorman & Arnaud, 2001) generated plot; 2) a background 
spectrum; 3) an auxiliary response file (ARF). Energies below 
0.35 keV are considered to be unreliable for the MOS due 
to low sensitivity and for the pn due to the low-energy noise 
(in particular at the edges of the detector) and, as such, were 
marked as 'bad' in XSPEC terminology. Data around the Cu 
fluorescence line for the pn (7.875 keV < E < 8.225 keV) 
were also marked 'bad'. The publicly available 'canned'$^{16}$
RMF associated with each spectrum is conveyed by a header
keyword. Some examples of the diversity of source spectra
contained amongst the source-specific spectral products are
shown in Fig. 6.

15 Where the source was only observed with one or two cameras the
equivalent EPIC counts were calculated for the absent camera(s) using
the pn to MOS count ratio 3.5 : 1, representative of the typical source
count ratios.

16 Pre-computed for the instrument, mode, event pattern selection and
approximate detector location of the source.

Fig. 6. Examples of auto-extracted 2XMM spectra. Sources are
serendipitous objects and spectra are taken from the EPIC pn
unless otherwise stated. Panels: a) a typical extragalactic source
(Seyfert I galaxy); b) line-rich spectrum of a localised region in
the Tycho supernova remnant (target); c) MOS 2 spectrum of a
stellar coronal source (target; H II 1384, Briggs & Pye 2003),
described by two-component thermal spectrum; d) spectrum of
the hot intra-cluster gas in a galaxy cluster at z = 0.29 (Kotov,
Trudolyubov & Vestrand 2006); e) heavily absorbed, hard X-ray
spectrum of the Galactic binary IGR J16318-4848 (target; Ibarra
et al. 2007); f) spectrum of a super-soft source with oxygen line
emission at ~ 0.57 keV; g) a relatively faint source showing a
two-component spectrum; h) source with power-law spectrum
strongly attenuated at low energies and with a notable red-shifted
iron line feature around 6 keV.

5.2. Time-series

Light curves for a given source were created with a common
bin-width (per observation) that was an integer multiple of 10
seconds (minimum width 10 seconds), determined by the
requirement to have at least 18 ct/bin for pn and at least 5 ct/bin
for MOS for the exposures used in source detection. All light
curves of a given source within an XMM-Newton observation
are referenced to a common epoch for ease of comparison.
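One way to derive such a common bin width is sketched below. It assumes that the count requirement is applied to the mean count rate of each exposure; the text does not state this explicitly, so the function and its inputs are hypothetical.

```python
import math

MIN_COUNTS = {"pn": 18.0, "MOS1": 5.0, "MOS2": 5.0}   # ct/bin, from the text

def common_bin_width(mean_rates, quantum=10.0):
    """Smallest multiple of `quantum` seconds giving every camera its
    minimum number of counts per bin.

    mean_rates: {camera: mean source count rate in ct/s} (assumed input).
    """
    needed = max(MIN_COUNTS[cam] / rate for cam, rate in mean_rates.items())
    return max(quantum, quantum * math.ceil(needed / quantum))

# pn needs 36 s and MOS1 needs ~41.7 s, so the common width is 50 s.
width = common_bin_width({"pn": 0.5, "MOS1": 0.12})
```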

The light curves themselves can include data taken during 
periods of background flaring because background subtraction 
usually successfully removes its effects. However, in testing for 
potential variability, to minimise the risk of false variability trig- 
gers, only data bins that lay wholly inside both instrument GTIs 
and GTIs reflecting periods of non-flaring background were 
used. 
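The bin selection used for variability testing can be sketched as follows, together with a plain Pearson $\chi^2$ against a constant level. The statistic shown here is deliberately simplified: it ignores the exposure weighting and the background-variance term that the pipeline's test includes, and all names and data are illustrative.

```python
def inside(start, stop, gtis):
    """True if [start, stop] lies wholly within one interval of gtis."""
    return any(g0 <= start and stop <= g1 for g0, g1 in gtis)

def usable_bins(edges, instrument_gtis, quiet_gtis):
    """Indices of time bins lying wholly inside both GTI lists."""
    return [i for i in range(len(edges) - 1)
            if inside(edges[i], edges[i + 1], instrument_gtis)
            and inside(edges[i], edges[i + 1], quiet_gtis)]

def chi2_constant(counts):
    """Pearson chi^2 of binned counts against their mean (sigma^2 = model)."""
    mean = sum(counts) / len(counts)
    return sum((c - mean) ** 2 / mean for c in counts)

edges = [0, 10, 20, 30, 40, 50, 60]
counts = [20, 22, 18, 95, 21, 19]            # bin 3 coincides with a flare
keep = usable_bins(edges, [(0, 60)], [(0, 30), (40, 60)])
chi2 = chi2_constant([counts[i] for i in keep])
```

Excluding the flare-affected bin before testing prevents the background flare from being reported as source variability.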

Two simple variability tests were applied to the separate light
curves: 1) a Fast Fourier Transform and 2) a $\chi^2$-test against a
null hypothesis of constancy. While other approaches, e.g., the
Kolmogorov-Smirnov test, maximum-likelihood methods, and
Bayesian methods are potentially more sensitive, the $\chi^2$-test was
chosen here as being a simple, robust indicator of variability. The
fundamental formula for $\chi^2$ is

$$\chi^2 = \sum_{i=1}^{N} \frac{(y_i - Y_i)^2}{\sigma_i^2}$$

where $y_i$ is the $i$th data value, $Y_i$ the model at this point, and $\sigma_i$ the
uncertainty. In the present case, the model $Y_i$, which incorporates
the null hypothesis that the source flux is constant over time, is
constructed as follows:

$$Y_i = f_{{\rm src},i}\, A_{\rm src}\, \Delta t\, [\phi_{\rm src} + \phi_{{\rm bkg},i}] , \qquad (3)$$

where $f_i$ are exposure values, $A$ is the collecting area, $\Delta t$ is
the time-series bin duration, and $\phi$ is a (bin-averaged) 'flux' in
counts per unit time per unit area.

The problem now is that a priori the expectation values $\phi_{{\rm bkg},i}$
for the background time-series are not known - they must be
estimated, with as low an uncertainty as possible, by forming a
background time-series in an (ideally) fairly large area which is
sufficiently far from the source to avoid cross-contamination. Also,
the average source flux $\phi_{\rm src}$ is not known and must also be
estimated from the (necessarily noisy) data at hand. After some
algebra it can be shown that the best estimate for $Y_i$ is given by

$$Y'_i = \ldots \qquad (4)$$

where $b_i$ are the measured background counts. The first term of
equation 3 represents a constant, unweighted time-average of the
background-subtracted source counts, derived from the whole
light curve, while the second term reflects the background
expected in the source aperture for time-bin $i$.

The $\sigma$ values in the $\chi^2$ sum present a problem. In the Pearson
formula appropriate to Poissonian data, $\sigma_i^2$ is set to $Y_i$. If we
simply substitute $Y'_i$ for $Y_i$ here, the resulting $\chi^2$ values are found
via Monte Carlo trials to be somewhat too large. This is because
the use of the random background variate $b_i$ in Eq. 4 introduces
extra variance into the numerators of the sum. A formula for $\sigma$
which takes this into account is

$$\sigma_i^2 = \ldots$$

For each exposure used, the pipeline generated a
background-subtracted source time-series and the corresponding
background time-series (corrected for exposure,
cosmic rays, and dead time), together with the graphical
representations of the data and of its power spectrum. The
$\chi^2$-statistics and probabilities are conveyed by header keywords.
Some example total-band time-series from these products that
highlight the range of source variability present in the 2XMM
catalogue are shown in Fig. 7.

Table 5. Event selection for source products.

                            pn                                  MOS
PATTERN (a)                 ≤ 4                                 ≤ 12
FLAG (a) for spectra        FLAG = 0                            (FLAG & 0xfffffeff) = 0
FLAG (a) for time-series    (FLAG & 0xfffffef) = 0              (FLAG & 0x766ba000) = 0
energy range                0.2 (b) - 12 keV                    0.2 (b) - 12 keV
GTIs for spectra            instrumental and background         instrumental and background
                            flare GTIs                          flare GTIs
GTIs for time-series        instrumental GTIs                   instrumental GTIs
GTIs for variability test   merged instrumental and             merged instrumental and
                            background flare GTIs               background flare GTIs

(a) column in the event lists
(b) the range 0.2 - 0.35 keV is set to bad in the spectra

5.3. Limitations of the automatic extraction

As with any automated extraction procedure, a few source products
suffer from problems such as low photon statistics, low
numbers of bins, background subtraction problems, and
contamination.

Spectra with few bins can arise for very soft sources where 
the total-band counts meet the extraction criteria but the bulk 
of the flux occurs below the 0.35 keV cut-off (Sect. 5.1). This
can also occur if the extraction is for an exposure with a shorter 
exposure time than those used in the detection stage, especially 
if the detection was already close to the extraction threshold. 
Similarly, background over-estimation in the exposure (or un- 
derestimation in the original detection exposure) can result in 
fewer source counts compared to those determined during the 
detection stage, yielding poorer statistics and low bin numbers 
for the time-series and spectra. This can occur when spatial 
gradients across the background region are imperfectly charac- 
terised, e.g., where the source lies near strong instrumental fea- 
tures such as OOT events, where there are marked steps in the 
count-rate levels between adjacent noisy and non-noisy CCDs, 
or where contaminant source exclusions are biased to one side 
of a background region that overlaps the wings of a very bright 
source or bright extended emission. In many cases the automatic 
(Sect. 7.3) as well as manual flag settings (Sect. 7.4) indicate
whether source products are likely to be reliable.

Contamination of the source extraction region (e.g., by an- 
other source, OOT events, or single reflections) can also cause 
problems if the contamination is brighter than or of comparable 
brightness to the extracted source. The nearest-neighbour col- 
umn can act as an initial alert in such cases - 19% of the cata- 
logue sources with spectra have neighbouring detections (of any 
brightness) within 28" (i.e., the extraction radius). 

The extraction process and exposure corrections are op- 
timised for point sources. Absolute fluxes in source-specific 
products of extended sources, therefore, may not be reliable. 
However, relative measures such as variability and spectral line 
detection should still be indicative. 



5.4. Known processing problems 

A few products are affected by known processing problems: 

(i) When the usable background region is very small, the 
background area calculation becomes imprecise and results in 
an inaccurate background-subtracted source spectrum. This can 
occur with bright sources in MOS W2 and W4 partial window 
modes where most of the background region lies outside the
110" × 110" window or in crowded areas where the source-free
area is markedly reduced. In the former case the source is usually
bright enough that background subtraction has negligible impact
and so does not need to be performed.

Fig. 7. Example auto-extracted 2XMM time-series. Sources are
serendipitous objects and the data are taken from the pn unless
otherwise stated. Panels: a) MOS1 data for Markarian 335
(Seyfert I - target); b) MOS1 data showing the decay curve
of GRB 050326 (target); c) X-ray flares from a previously unknown
coronally active star; d) time-series of the emission from
a relatively faint cluster of galaxies, showing no significant
variability (target); e) time-series of the obscured Galactic binary
IGR 16318-4848 (target; Ibarra et al. 2007); f) previously unknown
AM Her binary showing several phase-stable periodic
features (Vogel et al. 2007); g) highly variable AND periodic
object, likely to be a cataclysmic or X-ray binary (Farrell et
al. 2008) - the binning results in poor sampling of the intrinsic
periodic behaviour; h) source showing clear variability but
not flagged as variable in the catalogue (the probability of
variability falls below the threshold of $10^{-5}$). These last two cases
highlight the sensitivity of the variability characterisation to the
time bin size.

(ii) Attitude GTIs were not included in the extraction crite- 
ria, and occasionally the source was significantly displaced with 
respect to the aperture as defined by the detection image (in ex- 
treme cases, off the detector). This will affect the calculation of 
count rates in the spectra and the variability measurements for 
the time-series. 

(iii) Occasionally the light curve exposure correction failed 
(i.e., no time-series were produced) or light curves were inade- 
quately corrected for strong background variations across CCDs 
(which can cause spurious variability detection). The latter cases 
are confined to very bright extended sources and are mostly as- 
sociated with spurious detections. 



(iv) Neither spectra nor time-series are corrected for pileup 
(nor are the source count rates in the catalogue). Due to the dif- 
ficulties in detecting and quantifying pileup no attempt has been 
made to flag this effect. 



6. External catalogue cross-correlation 

As part of the XMM-Newton pipeline, the Astronomical 
Catalogue Data Subsystem (ACDS) generated products holding 
information on the immediate surrounding of each EPIC source 
and on the known astrophysical content of the EPIC FOV, high- 
lighting the possible non-detection of formerly known bright X- 
ray sources as well as indicating the presence of particularly im- 
portant astrophysical objects in the area covered by the XMM- 
Newton observation. 

In addition to Simbad^17 and NED^18, 202 archival catalogues
and article tables were queried from VizieR^19. They were selected
on the basis of their assumed high probability to contain the 
actual counterpart of the X-ray source. Basically all large area 
"high density" astronomical catalogues were considered, namely 
the SDSS-DR3 (Abazajian et al. 2005), USNO-A2.0 (Monet
et al. 1998), USNO-B1.0 (Monet et al. 2003), GSC 2.2 (STScI 
2001), and APM-North (McMahon et al. 2000) catalogues in 
the optical, the IRAS (Joint Science WG 1988; Moshir et al. 
1990), 2MASS (Cutri et al. 2003), and DENIS (DENIS con- 
sortium 2005) catalogues in the infrared, the NVSS (Condon et 
al. 1998), WISH (de Breuck et al. 2002), and FIRST (Becker 
et al. 1997) catalogues at radio wavelengths, and the main X- 
ray catalogues produced by Einstein (2E; Harris et al. 1994), 
ROSAT (RASS bright and faint source lists (Voges et al. 1999,
2000), RBS (Schwope et al. 2000), HRI (ROSAT Team 2000), 
PSPC (ROSAT 2000), and WGACAT (White et al. 2000) cat- 
alogues of pointed observations), and XMM-Newton (1XMM; 
XMM-SSC, 2003). Also included were large lists of homoge- 
neous objects (e.g., catalogues of bright stars, cataclysmic vari- 
ables, LMXBs, Be stars, galaxies, etc.). The full list of archival 
catalogues queried is included as one of the pipeline products. 

The XMM-Newton detections were cross-correlated with the 
archival entries taking into account positional errors in both the 
EPIC and the archival entries. The list of possible counterparts 
did not provide additional information on the relative merits of 
the cross-correlation or on the probability that the given archival 
entry was found by chance in the error circle of the X-ray source. 

The cross-correlation was based on the dimensionless variable:

$$r^2 = \frac{\Delta\alpha^2}{\sigma_\alpha^2} + \frac{\Delta\delta^2}{\sigma_\delta^2}$$

with $\sigma_\alpha^2 = \sigma_{\alpha_o}^2 + \sigma_{\alpha_x}^2$ and $\sigma_\delta^2 = \sigma_{\delta_o}^2 + \sigma_{\delta_x}^2$, where $\sigma_{\alpha_x}$ and $\sigma_{\delta_x}$ are
the standard deviations in RA and Dec of the X-ray source position
and $\sigma_{\alpha_o}$ and $\sigma_{\delta_o}$ the corresponding errors on the position of
the archival catalogued object. The error on the X-ray position
is the quadratic sum of the statistical error with the additional
error which depends on the effectiveness of the astrometric
correction (cf. Sect. 4.5). Positional errors of the archival entries
were either read from the respective catalogue or fixed according
to guidance in the relevant catalogue literature. In all cases,
the significance of the error was rescaled to the $1\sigma$-level.
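The dimensionless separation can be sketched in a few lines of Python. This is only an illustration, assuming that the variable is defined through $r^2 = \Delta\alpha^2/\sigma_\alpha^2 + \Delta\delta^2/\sigma_\delta^2$ with the X-ray and archival errors added in quadrature per axis; the function name and argument names are hypothetical and are not taken from the ACDS software. Offsets and $1\sigma$ errors are assumed to be in the same angular unit, with the RA offset already projected onto the sky.

```python
import math


def normalised_separation(d_alpha, d_delta,
                          sig_alpha_x, sig_delta_x,
                          sig_alpha_o, sig_delta_o):
    """Dimensionless separation r between an X-ray detection and an
    archival entry (hypothetical helper, not the pipeline code).

    d_alpha, d_delta : position offsets in RA and Dec (same unit as errors)
    sig_*_x          : 1-sigma errors of the X-ray position
    sig_*_o          : 1-sigma errors of the archival position
    """
    # Combine X-ray and archival errors in quadrature for each axis.
    var_alpha = sig_alpha_x ** 2 + sig_alpha_o ** 2
    var_delta = sig_delta_x ** 2 + sig_delta_o ** 2
    # Offsets are scaled by the combined errors before being summed.
    return math.sqrt(d_alpha ** 2 / var_alpha + d_delta ** 2 / var_delta)


# Example: offsets of 3" and 4" with unit combined errors on each axis.
r = normalised_separation(3.0, 4.0, 1.0, 1.0, 0.0, 0.0)
```

Because each axis is scaled by its own combined error, a given angular offset counts for less along the direction in which the positions are more uncertain.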



17 The SIMBAD Astronomical Database (Wenger et al. 2000). 

18 The NASA/IPAC Extragalactic Database. 

19 The VizieR Service at CDS (Ochsenbein et al. 2000).

The probability density distribution of position differences
between the X-ray source and its catalogue counterpart due to
measurement errors is a Rayleigh distribution. Hence, the probability
of finding the X-ray source at a distance between $r$ and
$r + \delta r$ from its archival counterpart is:

$$\delta p(r|{\rm id}) = r\, e^{-r^2/2}\, \delta r ,$$

with a cumulative distribution function:

$$\int_0^r \delta p(r|{\rm id}) = 1 - e^{-r^2/2} .$$

Thus, lists of counterparts with $r(\sigma)$ < 2.146, 3.035, and 3.439
are 90%, 99%, and 99.73% complete, respectively. Computing
the actual reliability of the identification requires a careful
calibration of the density of catalogue sources and of the likelihood
ratio method applied; in the near future, such reliabilities
will be provided for candidates found in the main archival
catalogues. However, in the absence of such information at the
pipeline level, it was decided, for completeness, to list all possible
identifications having positions consistent with that of the
X-ray source at the 99.73% confidence level, corresponding to
$3\sigma$.
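The quoted thresholds follow from inverting the Rayleigh cumulative distribution, $1 - e^{-r^2/2}$. The short Python sketch below reproduces them; the function names are illustrative only and are not part of the pipeline.

```python
import math


def completeness(r):
    """Fraction of true counterparts expected within normalised distance r
    (Rayleigh cumulative distribution with unit scale)."""
    return 1.0 - math.exp(-0.5 * r * r)


def radius_for_completeness(p):
    """Normalised distance r enclosing a fraction p of true counterparts,
    obtained by inverting 1 - exp(-r^2 / 2) = p."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must satisfy 0 <= p < 1")
    return math.sqrt(-2.0 * math.log(1.0 - p))


# Thresholds for the three completeness levels quoted in the text.
thresholds = [round(radius_for_completeness(p), 3)
              for p in (0.90, 0.99, 0.9973)]
```

Note that the two-dimensional 99.73% radius (3.439) is larger than 3 because the Rayleigh distribution describes a radial offset, not a one-dimensional Gaussian deviate.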

The ACDS results are presented in several interconnected 
HTML files (together with copies in FITS format). Graphical 
products are 1) a plot with the position of all quoted archival 
entries on the EPIC merged image, 2) an overlay of the position 
of the X-ray sources detected in the EPIC camera and contours 
of the EPIC merged image on a ROSAT all-sky survey image, 
and 3) finding charts based on sky pixel data provided by the 
CDS Aladin image server (Bonnarel et al. 2000). 

7. Quality evaluation 

As part of the quality assurance for the data processing, a number 
of procedures, both automated and manual, were performed on 
many of the data products to take note of intrinsic problems with 
the data as well as to detect software issues. Particular emphasis 
was given to potential problems with the source detection and 
characterisation, and quality flags were set accordingly. 



7.1. Visual screening of data products

The overall visual screening included data products from all 
three instrument groups (EPIC, RGS, OM) as well as those from 
the external catalogue cross-correlation (cf. Sect. 6). Only products
that could be conveniently assessed were inspected using a
dedicated screening script, that is, most HTML pages, all PNG 
images and all PDF plots (as representatives of data from the 
FITS files), all EPIC FITS images and maps (including source- 
location overlays), and the mosaiced OM FITS images with 
source overlay. For each observation a screening report with 
standardised comments was created, recording data and processing
problems (see, for example, Sect. 5.4), and made available
via the XSA.

As a result of the visual screening, two otherwise eligible ob- 
servations (obtained for experimental mode tests) were excluded 
from the catalogue since the tested mode was not properly sup- 
ported by the processing system and the source parametrisation 
was considered to be unreliable. 



7.2. Potential source detection problems 

Intrinsic features of the XMM-Newton instrumentation com- 
bined with some shortcomings of the detection process have 



given rise to detections that are obviously spurious^20. Possible
causes range from bright pixels and segments to OOT events (in
the case of pileup), RGA scattered light, single reflections from
the mirrors, and optical loading (cf. Appendix A and Fig. 4a).
In cases where the spatial background varied rapidly (e.g., PSF 
'spikes', filamentary extended emission, edges of noisy CCDs), 
the spline background map may deviate from the true back- 
ground. This could potentially have given rise to spurious source 
detections and could also have affected the measured parameters 
(including time-series and spectra) of real sources. 

Extended sources were particularly difficult to detect and 
parametrise due to their (often) filamentary or non-symmetric 
structure as well as the maximum allowed extent in fitting (80", 
Sect. 4.4.4). This often led to multiple detections of a large or
irregular extended emission region. On the other hand, multiple 
point sources (e.g., in a crowded field) might also be detected 
as extended (due to computational restrictions no attempt was 
made to distinguish more than two overlapping/confused point 
sources). See Sect. 9.9 and Fig. 14 for a discussion of extended
sources and some examples. 

7.3. Automated quality-warning flags for detections 

Some of the source detection problems could be identified and 
quantified so that the processing software could set automated 
quality warning flags in the source lists. For each detection, four 
sets of flags (one per camera plus a summary set covering all 
cameras), each containing twelve entries, were written into the 
observation source list. Nine of the flags in each set were popu- 
lated based on other key quantities available in the same source 
list. The meaning of these flags is summarised in Table 6. The
default value of every flag was False; a flag that has been set
has been changed to True. For each detection, Flags
2-7 were set in a common fashion across all four sets. Flags 
1, 8, and 9 are camera-specific, but any set to True were also 
reflected in the summary set. 

The criteria used to set the flags were determined largely empirically
from tests on appropriate sample data-sets (cf. Fig. 4b
for some examples). Flags set to True should be understood
mainly as a warning: they identify possible problematic issues 
for a detection such as proximity to a bright source, a location 
within an extended source emission, insufficient detector cover- 
age of the PSF of the detection, and known pixels or clustering 
of pixels that tend to be intrinsically bright at low energies. In all 
these cases the parameters of a real source may be compromised 
and there is a possibility that the source is spurious. 

Extended sources near bright sources and within larger ex- 
tended emission are most likely to be spurious and have been 
flagged as such. In addition, extended detections triggered by 
hot pixels or bright columns can be identified since their like- 
lihood in one band (of one camera) is disproportionally higher 
than in the other bands and cameras. However, no attempt has 
been made to flag spurious extended detections in the general 
case, that is, in areas where the background changes consider- 
ably on a small spatial scale and the spline maps cannot ade- 
quately represent this. At the same time, no point sources have 
been specifically flagged as spurious (see Sect. 7.4 regarding
manual flagging) though they are often caused by the same fea- 
tures as the spurious extended detections. The spatial density of 
real point sources is, in general, much higher than for extended
sources and the reliability of such a 'spurious' flag would be low.
Instead, Flags 2, 3, and 9 can be used as a warning that such a
source could be spurious.

20 Spurious detections caused by the background noise (as characterised
by their likelihood) are not discussed in this section, see
Sects.fSlEl

Flag  Description
      Definition for flag to be set True (cf. Notes)

1     Low detector coverage
      $m$ < 0.5
2     Near other source
      $r$ < 65 … AND $r_{\rm min}$ = 10" AND $r_{\rm max}$ = 400"
3     Within extended emission
      $r < 3 \cdot E$ AND $r_{\rm max}$ = 200"
4     Possible spurious extended detection near bright source
      Detection is extended AND Flag 2 is set AND $c_{\rm epic}$ > 1000
5     Possible spurious extended detection within extended emission
      Detection is extended AND $r$ < 160" AND fraction of rate compared
      with causing source is < 0.4
6     Possible spurious extended detection due to unusual large
      single-band detection likelihood
      Detection is extended AND fraction of detection likelihood per camera
      and band compared with the sum of all is > 0.9
7     Possible spurious extended detection
      At least one of the flags 4, 5, 6 is set
8     On bright MOS1 corner or bright low-gain pn column
      Source position is located on one of the affected pixels
9     Near bright MOS1 corner
      Source position within $r_f$ = 60" of a bright corner pixel
10    Not used
11    Within region where spurious detections occur
      Set manually
12    Bright ('originating') point source in region where spurious
      detections occur
      Set manually

Notes: $m$ is the detector coverage of the detection weighted by the PSF; $r$ is a radial distance in arcseconds from the 'originating' source within
which all detections receive this flag; $R_{\rm epic}$ is the EPIC source count rate in ct/s of the 'originating' source; $E$ is the extent parameter (core radius)
of the 'causing source' in arcseconds (cf. Sect. 4.4.4); $c_{\rm epic}$ is the EPIC source counts of the 'originating' source; $r_f$ is the radius used for source
PSF fitting in arcseconds.

7.4. Manual flag settings for detections 

In addition to the automated quality flags, a more rigorous visual 
screening of the source detection was performed for the EPIC 
fields to be used in the catalogue. The outcome of this process 
was reflected in two flags (11 and 12) as described below and 
summarised in Table 6.

Images of each field, with source overlay, were inspected vi- 
sually and areas with likely spurious detections were recorded 
(as ds9-regions; Joye & Mandel 2003). Such regions could be 
regular (circle, ellipse, box) or irregular (polygon); in cases 
where only a single detection was apparently spurious a small 
circle of 10" radius was used, centred on this detection. It should 
be stressed that these regions, except for the latter case, could 
include both suspected spurious and real detections. In many 
cases (especially at fainter fluxes) it was impossible to visually 
distinguish between a real source and a spurious detection that 
was caused by artefacts on the detector or by insufficient back- 
ground subtraction. In addition, the effect of such features on 
the parameters of a nearby real source has not been investigated 
in detail. For example, single reflections or the RGA scattered- 
light features were not included in the background maps and 
may therefore have affected the source parameters. On the other 
hand, as the source parameters are derived by the fitting pro- 
cess in order of decreasing source brightness, the parameters of 
fainter sources take the PSF of nearby bright sources into
account (Sect. 4.4.3).

The ds9-regions were converted to EPIC image masks where 
the bad areas have the value zero and the rest of the field has 
the value one. These masks are available as catalogue products
(Sect. 10); they can be combined with the camera detection
masks to study, for example, the sky coverage.

The masks were used to flag sources within the masked areas 
with Flag 11. In many cases, the so-called 'originating' source 
(a bright point source, cf. Flag 2, or a large or irregular extended 
source, cf. Flag 3) was located within the masked region. Though 
the brightest source was fitted before the fainter ones, the contri- 



bution of the faint sources to the fit of the bright source is con- 
sidered to be negligible (Sect. 14.4.3b . Hence, the 'originating' 
point source was identified by setting its Flag l^cjto distinguish 
it from the other detections with Flag 1 1 in that particular ds9- 
region, the parameters of which may be affected by the presence 
of the indicated bright source due to imperfections in the PSF 
used. In the case of bright extended sources, however, the situ- 
ation was different: the extent parameter was obviously affected 
by nearby spurious detections, and consequently the brightness 
was underestimated. Flag 12 was therefore only set for point 
sources. 



7.5. Quality summary flag 

For easier use of the quality-flag information, the catalogue gives 
a summary flag which combines the flags described above (11 
per camera per detection) to give a single, overall quality indica- 
tion for each detection. Its five possible values are as follows (in 
order of increasing severity): 

0: There are no indications of problems for this detection; none 
of the flags [1-12] for the three cameras [pn,M1,M2] are set
to True. This value can be used to obtain the cleanest possi- 
ble samples (but possibly at the expense of omitting some 
otherwise acceptable detections). (71% of all detections.) 

1: The source parameters are considered to be possibly com- 
promised; at least one of the warning flags [1,2,3,9] for any 
of the cameras [pn,M1,M2] is True. This value can be used
to accept detections for further potential use, but they should 
be subjected to careful scrutiny dependent on the specific ap- 
plication. (9% of all detections.) 

2: The detection may be spurious but was not recognised as 
such during visual inspection; at least one of the auto- 
mated 'spurious detection' flags [7,8] for any of the cameras 
[pn,M1,M2] is True but the manual flag [11] is False. This
value can be used to accept detections for further potential 
use, but they should be subjected to careful scrutiny depen- 
dent on the specific application. (1% of all detections.) 



21 Note that Flag 12 was not set when the source appeared to be split
into two, cf. Sect. 4.4.4, or when a close-by fainter detection appeared
to be of comparable brightness.

3: The detection lies in a region where spurious detections occur
but which could not be dealt with in an automated way;
the manual flag [11] is True but the automated 'spurious
detection' flags [7,8] of all the cameras [pn,M1,M2] are False.
Detections with this value should be used only after very 
careful scrutiny, as they may well be spurious, unless flag 12 
is True, in which case the detection (and possibly its param- 
eters) may well be valid, as it is likely to be a strong source. 
(15% of all detections, where Flag 12 was set for 600 detec- 
tions.) 

4: The detection lies in a region where spurious detections oc- 
cur and is flagged as likely spurious; the manual flag [11] 
is True and any of the automated 'spurious detection' flags 
[7,8] for any of the cameras [pn,M1,M2] is also True. It is
recommended that detections with this value should not nor- 
mally be used. (4% of all detections.) 
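The five-level scheme above can be summarised as a small decision function. The Python sketch below is an illustration only: the function name and the input layout (a mapping from camera name to the set of flag numbers that are True) are hypothetical, and it assumes that Flags 4-6 are always reflected in Flag 7 and that Flag 12 only occurs together with Flag 11, as the descriptions suggest.

```python
def summary_flag(flags_by_camera):
    """Return the summary flag (0-4) for one detection.

    flags_by_camera : dict mapping a camera name ('pn', 'M1', 'M2') to the
                      set of flag numbers set to True for that camera.
    """
    # A flag counts if it is True for any of the cameras.
    set_flags = set().union(*flags_by_camera.values())

    manual = 11 in set_flags                   # manually masked region
    spurious = bool(set_flags & {7, 8})        # automated 'spurious' flags
    warning = bool(set_flags & {1, 2, 3, 9})   # automated warning flags

    if manual and spurious:
        return 4
    if manual:
        return 3
    if spurious:
        return 2
    if warning:
        return 1
    return 0


# Example: a pn warning flag only, then a manual flag plus a MOS1 flag 8.
example_warning = summary_flag({"pn": {2}, "M1": set(), "M2": set()})
example_worst = summary_flag({"pn": {11}, "M1": {8, 11}, "M2": set()})
```

Testing the most severe condition first means that each detection receives the highest value for which it qualifies.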

Flag 12 was not included in the summary flag. Selecting by
Flag 12 as well can provide a clean and more complete
sample, as noted above, since this flag is usually given to reasonably
bright point sources.

The screening flags also offer a means of avoiding source- 
specific data products with possible problems, noting that of 
all detections with products a significant fraction have summary 
flag > 3 indicating potential issues with the spectra and/or time 
series. 



7.6. Overall observation classification 

The summary flag assigned to each detection in the catalogue 
provides an overall classification of each detection included in 
the catalogue. On the other hand, since about half of all obser- 
vations in the catalogue are little affected by artefacts and back- 
ground subtraction problems, an observation classification of- 
fers the possibility of selecting good quality fields rather than 
good quality detections. This classification is based on the fraction
of area masked out in the flag mask (Sect. 7.4) as compared
to the total area used in the source detection (from the combined
EPIC detection mask) for that observation. Six classes of observations
were identified. They are listed in Table 7 together with
the percentage of observations affected, the fractional area, and 
the approximate size of the excluded region (note that the flag 
mask may comprise several regions in various shapes). 
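As an illustration, the observation class can be derived from the masked fraction as sketched below in Python. The function name is hypothetical, the class boundaries are those listed in Table 7, and the treatment of values falling exactly on a boundary (here assigned to the higher class) is an assumption, since the table does not specify it.

```python
def observation_class(bad_area, total_area):
    """Observation class (0-5) from the flagged and total detection areas.

    bad_area   : area masked out in the flag mask
    total_area : area used in the source detection (same unit)
    """
    if total_area <= 0:
        raise ValueError("total_area must be positive")
    frac = bad_area / total_area
    if frac <= 0.0:
        return 0          # no region flagged
    if frac >= 1.0:
        return 5          # whole field flagged as bad
    if frac < 0.001:
        return 1          # below 0.1%
    if frac < 0.01:
        return 2          # 0.1% to 1%
    if frac < 0.10:
        return 3          # 1% to 10%
    return 4              # 10% to 100%


# Example: a single 10" radius circle (about 314 arcsec^2) in a field of
# roughly 2.5e6 arcsec^2 falls in class 1.
example_class = observation_class(314.0, 2.5e6)
```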



8. Catalogue compilation 

The 2XMM catalogue is a catalogue of detections. As such, 
every row in the 2XMM catalogue represents a single detec- 
tion of an object from a separate XMM-Newton observation. 
The construction of the 2XMM catalogue consists of two main 
steps. The first involves the aggregation of the data of individ- 
ual detections from the separate observation source lists into 
a single list of detected objects, adding additional information 
about each detection and meta-data relating to the observation 
in which the detection was made. The second step consists of 
cross-matching detections, identifying resulting unique celes- 
tial objects and combining or averaging key quantities from the 
detections into corresponding unique-source values. Ultimately, 
the ensemble of data for both detections and unique sources be- 
comes the catalogue. 

The primary source of data for the catalogue was the set 
of 3491 EPIC summary source list files from the maximum-
likelihood source-fitting processes (Sect. 4.4.3). Additional
information incorporated into the catalogue for each detection
includes the detection background levels, the variability
information (from the EPIC source time-series files; see below) and
the detection flags from the automatic flagging augmented by
the manual data screening process (see Sects. 7.4 and 7.5).
Ancillary information added to the catalogue entries also
includes various observation meta-data parameters (e.g., observation
ID, filters and modes used) and the observation classification
determined as part of the data screening process (Sect. 7.6).
In the final catalogue table each detection is also assigned a 
unique detection number. 

The measured and derived parameters of the detections taken 
from the pipeline product files are reflected in the 2XMM catalogue
by a number of columns described in Appendix D.1-D.6.
For the variability information for detections (Appendix D.6),
the variability identifier was set to True for a detection if at least
one of the time-series for this detection (derived from all appropriate
exposures) had a $\chi^2$-probability < $10^{-5}$ based on the
null hypothesis that the source was constant (cf. Sect. 5.2). The
probability threshold was chosen to yield less than one false trigger
over the entire set of time-series. Where the flag was set, the
camera and exposure ID with the lowest $\chi^2$-probability were also
provided for convenience. No assessment of potential variability 
has been made between observations for those sources detected 
more than once. 
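The rule for the per-detection variability identifier can be sketched as follows in Python. The function name, the tuple layout, and the use of None for a detection without any time-series are assumptions made for illustration; only the $10^{-5}$ threshold and the choice of the lowest-probability time-series come from the text.

```python
VARIABILITY_THRESHOLD = 1e-5  # chi^2-probability threshold quoted in the text


def variability_identifier(time_series):
    """Variability identifier for one detection.

    time_series : list of (camera, exposure_id, chi2_probability) tuples,
                  one per time-series extracted for the detection.

    Returns (is_variable, (camera, exposure_id)); the second element is
    given only when the identifier is True, and both are None when no
    time-series exist.
    """
    if not time_series:
        return None, None
    # The time-series with the lowest probability of being constant
    # decides the outcome and is reported when the flag is set.
    camera, exposure_id, prob = min(time_series, key=lambda ts: ts[2])
    if prob < VARIABILITY_THRESHOLD:
        return True, (camera, exposure_id)
    return False, None


# Example with made-up probabilities for three exposures.
example = variability_identifier(
    [("pn", "S003", 3e-2), ("M1", "S001", 4e-7), ("M2", "S002", 2e-4)]
)
```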



8.1. Unique celestial sources 

XMM-Newton observations can yield multiple detections of the 
same object on the sky where a particular field is the subject of 
repeat pointings or because of partial overlaps from dedicated 
mosaic observations or fortuitous overlaps from unrelated point- 
ings. As such, the catalogue production process also sought to 
identify and collate data for all detections pertaining to unique 
sources on the sky, providing a unique-source indexing system 
within the catalogue. In parallel, the catalogue provides a num- 
ber of derived quantities relating to the unique sources computed 
from the constituent detections. 

To identify unique sources from multiple detections, reliable 
estimates of the position error, $\sigma_{\rm pos}$, of each detection are essential.
The best estimate of the position error was found to be

$$\sigma_{\rm pos} = \sqrt{\sigma_{\rm sys}^2 + \sigma_{\rm stat}^2} , \qquad (5)$$

where $\sigma_{\rm sys}$ is the additional error (Sect. 4.5, see also footnote 12)
and $\sigma_{\rm stat}$ is the statistical centroid uncertainty measured from the
source-fitting stage (Sect. 4.4.3).

Two detections from different observations with respective 
position errors of $\sigma_1$ and $\sigma_2$ were assumed to be potentially
associated with the same celestial source if their separation is

$$r_{\rm sep} < 3(\sigma_1 + \sigma_2) ,$$

with 7" as an upper-limit. The 7" limit to position offsets in the
matching process was determined empirically as the best value
to prevent spurious matches (dominated by a few weak extended
sources with large position errors) without having a significant
effect on the number of genuine matches. A match was, however,
rejected if $r_{\rm sep} > 0.9\, d_1$ or $r_{\rm sep} > 0.9\, d_2$, where $d_1$ and $d_2$
are the distances from each detection to its nearest neighbour in
the same observation. This latter provision means that no two 
distinct sources from the same image should be matched. No 
quality flag information was used in the matching process. 
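The pairwise criterion can be written compactly as in the Python sketch below. The function and parameter names are hypothetical; treating the 7" limit as a strict upper bound and using infinity for a detection with no neighbour in its own observation are assumptions. All quantities are in arcseconds.

```python
import math


def may_be_same_source(r_sep, sigma_1, sigma_2,
                       d_1=math.inf, d_2=math.inf, r_max=7.0):
    """Decide whether two detections from different observations are
    potentially associated with the same celestial source.

    r_sep            : separation of the two detections
    sigma_1, sigma_2 : position errors of the two detections
    d_1, d_2         : distance from each detection to its nearest
                       neighbour in its own observation
    r_max            : empirical upper limit on the separation
    """
    # Separation must be within 3 times the summed errors, capped at r_max.
    if r_sep >= min(3.0 * (sigma_1 + sigma_2), r_max):
        return False
    # Reject if the separation approaches the distance to a neighbour
    # in the same observation, so that distinct sources are not merged.
    if r_sep > 0.9 * d_1 or r_sep > 0.9 * d_2:
        return False
    return True


# Example: 2" separation, 0.5" errors, neighbours 20" and 30" away.
example_match = may_be_same_source(2.0, 0.5, 0.5, 20.0, 30.0)
```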

class  %age of 2XMM obs  definition              comment

0      38%               bad area = 0%           no region has been identified for flagging
1      12%               0% < bad area < 0.1%    ≲ 3 single detections
2      10%               0.1% < bad area < 1%    circular region with 40" ≲ r ≲ 60"
3      25%               1% < bad area < 10%     circular region with 60" ≲ r ≲ 200"
4      10%               10% < bad area < 100%   circular region with r ≳ 200"
5      5%                bad area = 100%         the whole field is flagged as bad

Using these constraints, the detection table was cross-correlated
with itself to find all possible pairs of detections having
error-circle overlaps. Some detections were found to have as
many as 31 such overlaps, since a few areas of sky were observed
this many times (generally calibration observations). Resolving 
this list into a set of unique celestial sources required some ex- 
perimentation because of potential ambiguity in a few crowded 
or complex fields. The extreme scenarios were 1) to assume a set 
of detections was associated with a unique source only if they all 
overlapped each other - this was considered too conservative; 2) 
to assume that a set of detections constituted a unique source if 
each member overlapped at least one other member - this was 
deemed overly generous, i.e., it would have included a few pairs 
of detections whose mutual separations would be incompatible 
with coming from a single source. The algorithm adopted gave 
priority to those detections with the highest number of overlaps 
(because they were likely to be near the true source centre) and, 
this number being equal, to count-rate agreement. The list of 
overlapping detections was therefore sorted in descending order 
of the number of overlaps and the EPIC total-band count rate 
and then processed in that order. Each detection was associated 
with all its overlapping detections, except those which had al- 
ready been removed from the list by having been associated with 
another (better connected or stronger) detection. In the final cat- 
alogue the number of detections which might have been asso- 
ciated with a source different from the one actually assigned to 
them, given a different order of processing, was about one hun- 
dred, which was significantly lower than the figure from various 
alternative algorithms. These ambiguous detections were almost 
all from observations which the screening process flagged as 
unreliable, suggesting that further refinements to the algorithm 
would have been of little practical value. 
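The ordering and association steps described above can be sketched as a simple greedy grouping. This Python illustration is not the catalogue software: the function name and input layout are hypothetical, and it implements only the stated sort (number of overlaps, then EPIC total-band count rate, both descending) followed by association of each remaining detection with its not-yet-assigned overlaps.

```python
def group_unique_sources(overlaps, count_rate):
    """Greedy grouping of detections into unique sources.

    overlaps   : dict mapping a detection id to the set of detection ids
                 whose error circles overlap it (symmetric relation)
    count_rate : dict mapping a detection id to its total-band count rate

    Returns a list of groups; the first member of each group is the
    detection that seeded it.
    """
    # Best-connected detections first, ties broken by higher count rate.
    order = sorted(overlaps,
                   key=lambda d: (-len(overlaps[d]), -count_rate[d]))
    assigned = set()
    sources = []
    for det in order:
        if det in assigned:
            continue  # already attached to a better-connected detection
        members = [det] + sorted(n for n in overlaps[det]
                                 if n not in assigned)
        assigned.update(members)
        sources.append(members)
    return sources


# Example: a chain A-B-C-D in which only neighbouring detections overlap.
example_overlaps = {"A": {"B"}, "B": {"A", "C"}, "C": {"B", "D"}, "D": {"C"}}
example_rates = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0}
example_sources = group_unique_sources(example_overlaps, example_rates)
```

In the chain example, C seeds the first source because it ties with B on the number of overlaps but has the higher count rate. Detection A is then left as a separate source, which illustrates how the result can depend on the processing order.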

The algorithm adopted for the identification of unique 
sources appears to be reliable in the great majority of cases, but 
there are known to be a few confused areas where the results are 
likely to be imperfect. The most common cause is where real dif- 
fuse or bright objects give rise to (generally spurious) additional 
detections which happen to approximately coincide spatially in 
different observations. In most cases it is likely that the sources 
will have received a manual flag. Incorrect matching can also po- 
tentially occur where centroiding is adversely affected by pileup 
or optical loading, where one or more contributing observations 
have significant attitude errors which could not be astrometrically
rectified (Sect. 4.5), or where a real source is located close
to another detection associated with an artefact such as residual 
OOT events from a strongly piled-up source elsewhere in the im- 
age. Where pileup or artefacts are involved, affected sources may 
have been assigned automatic or manual flags anyway. It should 
be emphasised, however, that flag information is not used in the 
source matching process. Based on the extensive visual inspec- 
tion, incorrect detection matching is believed to be extremely 
rare (< 200 detections affected). Inevitably, in a few cases, the 
matching process fails to match some detections that belong to- 
gether. 

A number of quantities for unique sources are included in the
2XMM catalogue, based on error-weighted merging of the
constituent detection values (see Appendix D.7). The IAU name of
each unique source was constructed from its coordinates. Note
that an individual detection is completely specified by its IAU
name and its detection identifier. The unique-source data were
augmented with five quantities that were not based on
error-weighted merging:

1. The unique-source detection likelihood was set to the highest
EPIC total-band detection likelihood, i.e., it reflects the
strongest constituent detection of a unique source.

2. A unique-source extent likelihood was computed as the simple
average of the corresponding EPIC detection values.

3. The reduced $\chi^2$-probability for the variability of a unique
source was taken as the lowest of the detection values, indicative
of the detection with the highest likelihood of being variable,
where variability information was available.

4. Where variability information existed for any of the constituent
detections, a unique-source variability identifier was set to True
if any were True and to False if none were True. Where no
variability information was available, the unique-source flag was
set to Undefined.

5. A unique-source summary flag took the maximum of the detection
summary flag values (Sect. 7.5), i.e., reflecting the worst-case
flag from any of the detections of the source.
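
The five non-weighted quantities can be summarised in a short sketch. The dictionary keys are hypothetical names chosen for illustration (they are not the catalogue column names), and `None` stands in for "Undefined".

```python
def merge_unique_source(detections):
    """Combine per-detection values into the five unique-source quantities.

    detections: list of dicts with (hypothetical) keys
      'det_ml'   detection likelihood (float)
      'ext_ml'   extent likelihood (float)
      'chi2prob' variability chi^2 probability (float or None)
      'var_flag' variability identifier (bool or None)
      'sum_flag' summary flag (int)
    """
    chi2 = [d["chi2prob"] for d in detections if d["chi2prob"] is not None]
    var = [d["var_flag"] for d in detections if d["var_flag"] is not None]
    return {
        "det_ml": max(d["det_ml"] for d in detections),        # strongest detection
        "ext_ml": sum(d["ext_ml"] for d in detections) / len(detections),  # simple mean
        "chi2prob": min(chi2) if chi2 else None,                # most variable detection
        "var_flag": any(var) if var else None,                  # None = Undefined
        "sum_flag": max(d["sum_flag"] for d in detections),     # worst-case flag
    }
```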

The 2XMM catalogue was also cross-correlated against the 
1XMM and 2XMMp catalogues during the construction pro- 
cess. For each unique 2XMM source, the most probable match- 
ing 1XMM counterpart and 2XMMp counterpart were identified 
and listed in the 2XMM catalogue. The matching algorithm em- 
ployed was similar to the one described for identifying unique 
sources but the maximum positional offset between the new
catalogue and the older ones was set at 3". This is a rather
conservative value, since a number of sources, especially in 1XMM,
have positional errors greater than this, but it ensures that there
are very few incorrect matches or ambiguous cases.
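
A positional cross-match with a fixed maximum offset might look like the sketch below. It uses the nearest candidate as a simple stand-in for the "most probable" counterpart, which is an assumption; the function names are hypothetical.

```python
import math

def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation via the haversine formula; inputs in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a))) * 3600.0

def best_counterpart(source, candidates, max_offset_arcsec=3.0):
    """Return the nearest (ra, dec) candidate within the offset limit, or None."""
    best, best_sep = None, max_offset_arcsec
    for cand in candidates:
        sep = angular_separation_arcsec(source[0], source[1], cand[0], cand[1])
        if sep <= best_sep:
            best, best_sep = cand, sep
    return best
```

With a 3" limit, a source whose older-catalogue position is off by more than 3" is simply left unmatched rather than risk a wrong association.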

This resulted in ~ 88% of all 2XMMp sources having a 
match with 2XMM sources. Apart from those lying outside the 
3" matching circle, non-matched sources are found to be either 
spurious, at the detection limit, or from observations that were
not included in 2XMM. Comparison with 1XMM is not straightforward
due to the differences in the detection scheme (e.g., the
source detection in 1XMM was done per camera) and likelihood 
cutoffs. Note, though, that 1XMM comprises only 585 of the 
2XMM fields. 



9. Catalogue characterisation & results 

9.1. Overall properties 

The catalogue contains 246 897 detections drawn from 3491
public XMM-Newton observations (Fig. 1). These detections
relate to 191 870 unique sources. Of these, 27 522 X-ray sources
were observed more than once; some were observed up to 31
times in total due to the fact that many sky regions are covered
by more than one observation. Of the 246 897 X-ray detections,
20 837 are classified as extended. Table 8 shows the number of
detections and unique sources per camera and energy band (split
into point sources and extended sources); a likelihood threshold
L > 10 has been applied but no selection of detection flags has
been made.

Table 8. Numbers of detections with likelihood L > 10 in the
2XMM catalogue.

Camera  Energy band  Point   Ext'd   Unique point  Unique ext'd
        (keV)        source  source  source        source
pn      0.2 - 0.5    38074   4319    30811         3843
pn      0.5 - 1.0    63248   7457    50639         6714
pn      1.0 - 2.0    68197   6217    55035         5555
pn      2.0 - 4.5    37511   3604    30702         3167
pn      4.5 - 12.0   11144   1586     8682         1337
M1      0.2 - 0.5    20841   3392    15887         2958
M1      0.5 - 1.0    40965   6734    30998         5892
M1      1.0 - 2.0    52569   6754    40062         5882
M1      2.0 - 4.5    34230   4452    26710         3858
M1      4.5 - 12.0    7818   1825     5776         1547
M2      0.2 - 0.5    20626   3485    15718         3012
M2      0.5 - 1.0    42488   7045    32055         6149
M2      1.0 - 2.0    56060   6997    42624         6107
M2      2.0 - 4.5    36760   4703    28538         4080
M2      4.5 - 12.0    8546   2008     6265         1716

The catalogue contains detections down to an EPIC likelihood
of 6. Around 90% of the detections have L > 8 and ~ 82%
have L > 10. Simulations demonstrate that the false detection
rate for typical high Galactic latitude fields is ~ [2, 1, 0.5]% for
detections with L > [6, 8, 10] respectively (Sect. 9.4). We note
that the source detection in 2XMM has a low degree of
incompleteness at L < 10. This arises from the fact that the first
stage of the source detection (Sect. 4.4.2) requires that each
detection have L > 5. As this first stage of the processing is
relatively crude, the incompleteness primarily arises from this
preselection of low significance detections.

The 2XMM catalogue is intended to be a catalogue of 
serendipitous sources. The observations from which it has been 
compiled, however, are of course pointed observations which 
typically contain one or more target objects chosen by the original
observers, so the catalogue contains a small fraction of targets
which are by definition not serendipitous. Appendix C provides
details of the target identification and classification. From
this analysis we find that around 2/3 of the intended targets are 
unambiguously identified in their XMM-Newton observations 
but, allowing for multiple detections, only ~ 1400 targets are 
plausibly associated with 2XMM catalogue sources. This means 
that < 1 % of 2XMM sources are the target of the observation, al- 
though in a few observations (e.g., nearby galaxies) the number 
of sources associated with the target can clearly be much greater. 

More generally the fields from which the 2XMM catalogue 
is compiled may also not be representative of the overall X- 
ray sky. The classification of the XMM-Newton observations 
(Appendix C and Table ??) is relevant to avoiding potential
selection bias in the use of the catalogue.

9.2. Sky coverage and survey sensitivity 

To compute the effective sky coverage, the sky was notionally
covered by a grid of pixels using the HEALPix projection
(Gorski et al. 2005). Adequate resolution was obtained using
pixels ~ 13" across. For each observation included in 2XMM
the exposure times were computed for each HEALPix pixel taking
into account the exposure map for each observation (i.e., the
actual coverage taking into account observing mode, CCD gaps,
telescope vignetting, etc.). From this analysis we find that in
total the catalogue fields cover a sky area of more than 500 deg$^2$.
The non-overlapping sky area is ~ 360 deg$^2$ (~ 1% of the sky).
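
The pixel bookkeeping can be sketched in a few lines. This is an illustration only: the resolution parameter `nside = 16384` is an assumption chosen because it gives pixels close to 13" across (the text does not state the value used), and the per-observation sets of exposed pixel indices are assumed to be available already.

```python
import math

SKY_AREA_DEG2 = 4 * math.pi * (180 / math.pi) ** 2  # whole sky, ~41 253 deg^2

def healpix_pixel_size_arcsec(nside):
    """Side of a square with the same area as one (equal-area) HEALPix pixel."""
    npix = 12 * nside ** 2
    return math.sqrt(SKY_AREA_DEG2 / npix) * 3600.0

def covered_area_deg2(exposed_pixels_per_obs, nside):
    """Return (total, non-overlapping) area from per-observation pixel sets."""
    pix_area = SKY_AREA_DEG2 / (12 * nside ** 2)
    total = sum(len(p) for p in exposed_pixels_per_obs) * pix_area
    unique = len(set().union(*exposed_pixels_per_obs)) * pix_area
    return total, unique

print(healpix_pixel_size_arcsec(16384))   # close to 13 arcsec
print(360.0 / SKY_AREA_DEG2)              # sky fraction of 360 deg^2, just under 1%
```

Counting every exposed pixel once per observation gives the total area, while counting each pixel once overall gives the non-overlapping area.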

The sensitivity of the 2XMM survey catalogue was estimated 
empirically using the method of Carrera et al. (2007). The algo- 
rithm presented in their Appendix A was used to compute sen- 
sitivity maps for each instrument and energy band, using data 
from the exposure maps and background maps from each obser- 
vation. Using a grid of HEALPix pixels in a similar way to that 
outlined above, the limiting flux of the most sensitive observa- 
tion of each part of the sky was estimated. Figure [8] shows the 
sky area against limiting flux for each EPIC camera and energy 
band separately. This analysis provides a relatively robust esti- 
mate of the total sky area of the 2XMM catalogue for each of 
the three EPIC cameras, although it does not take into account 
those sky regions which are effectively useless for serendipitous 
source detection due to the presence of bright objects or certain 
instrumental artefacts (see discussion in Sect. 3.1 and Fig. 4b
and c; see also footnote 22). These area-flux plots computed for
L > 10 show that the effective sky coverage for the MOS2 camera is
~ 370 deg$^2$ (for the MOS1 camera it is ~ 360 deg$^2$ due to the
loss of one of the MOS1 CCDs in March 2005), whilst for the pn
camera the area is ~ 330 deg$^2$, due primarily to reduced or zero
imaging sky area provided by some of the pn observing modes. The
limiting fluxes vary between camera and energy band. For the
pn camera which provides the highest sensitivity, the minimum
detectable fluxes in the soft (0.5 - 2 keV), hard (2 - 12 keV) and
hardest (4.5 - 12 keV) bands at 10% sky coverage are ~ [2, 15,
35] $\times 10^{-15}$ erg cm$^{-2}$ s$^{-1}$, respectively. The fluxes
for >90% sky coverage (i.e., close to complete coverage) in these
bands are ~ [1, 9, 25] $\times 10^{-14}$ erg cm$^{-2}$ s$^{-1}$,
respectively.
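
An area-flux curve of this kind can be built from per-observation sensitivity maps as sketched below. The sketch assumes the limiting flux has already been evaluated per sky pixel for each observation; the function and argument names are hypothetical.

```python
def sky_area_vs_flux(limits_per_obs, pixel_area_deg2, flux_grid):
    """Sky area (deg^2) over which sources at each flux in flux_grid are detectable.

    limits_per_obs: list of dicts {pixel index: limiting flux}, one per observation
    """
    best = {}
    for limits in limits_per_obs:
        for pix, flim in limits.items():
            # Keep the most sensitive (lowest) limit among overlapping observations.
            if pix not in best or flim < best[pix]:
                best[pix] = flim
    return [pixel_area_deg2 * sum(1 for f in best.values() if f <= s)
            for s in flux_grid]
```

Reading off the flux at which the curve reaches 10% or 90% of its maximum gives limiting fluxes of the kind quoted above.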



9.3. Flux and count distributions 

The distribution of fluxes for the 2XMM catalogue detections is 
shown in Fig. 9. This figure illustrates that the typical soft-band
flux for the catalogue sources is ~ 5 $\times 10^{-15}$ erg cm$^{-2}$ s$^{-1}$ and is
~ 2 $\times 10^{-14}$ erg cm$^{-2}$ s$^{-1}$ in the hard and total bands. These values
correspond quite closely to the fluxes of the sources which dom- 
inate the cosmic X-ray background (where the slope of the ex- 
tragalactic source counts breaks), demonstrating the importance 
of 2XMM in providing large samples at these fluxes. 

Also shown in Fig. 9 is the distribution of total counts in the
combined EPIC images for the same sample of 2XMM detections.
As expected the distribution is dominated by low count
sources, with the peak lying at ~ 70 counts. This plot also
illustrates the effect of the targets of the XMM-Newton fields
themselves which only contribute significantly, not surprisingly,
above ~ 2000 EPIC counts.

We note that it would be possible to combine the survey
sensitivity curves discussed in Sect. 9.2 and the flux
distributions discussed here to construct the source counts (i.e.,
the log N - log S relationship) for the 2XMM catalogue. In practice,
however, the results of this exercise would have limited value
due to the large uncertainties in the correct area-sensitivity
corrections for the substantial number of fields included in 2XMM
which contain, for example, bright objects or are subject to
problematic instrumental effects. A separate paper, Mateos et al.
(2008), presents the log N - log S relationship and results for
a carefully selected subset of the 2XMM fields at high Galactic
latitudes.

22 Figure 8 also does not take into account the effects of Poisson
noise which produces a probability distribution for source
detectability about the sensitivity limit. These effects are only
important at the low count limit, i.e. essentially only at faint
fluxes, cf. Georgakakis et al. (2008).

Fig. 8. Sky area as a function of flux limit for the 2XMM
catalogue computed for sources with a detection likelihood limit
L > 10 in the respective energy band. Red curves are for MOS2;
blue curves are for pn. (MOS1 is not shown but is very similar
to MOS2.)

Top panel: Energy bands 1, 2, 3, 4, & 5 for each camera are
shown with solid, long-dash, dash-dot, dotted, & dot-dot-dot-dashed
line styles, respectively.

Bottom panel: Energy bands 6 & 7 for each camera are shown
with solid & long-dash styles, respectively.



9.4. False detection rate & likelihood calibration 

The significance of the source detection in the 2XMM catalogue
is characterised by the maximum likelihood parameter for
the detection, L (cf. Sect. 4.4.3). Although the detection
likelihood values are formally defined in terms of the probability
of the detection occurring by chance, the complexity of the data
processing implies that the computed likelihoods need to be
carefully assessed. To investigate the calibration of the
likelihood values and the expected false detection rate, we thus
carried out realistic Monte-Carlo simulations of the 2XMM
catalogue source detection and parameterisation process. The
simulations performed were chosen to represent typical high-latitude
fields without bright sources or extended X-ray emission apart
from the unresolved cosmic X-ray background. The simulations
include a particle background component and a distribution
of X-ray point sources with uniform spectral shape drawn
from a representative extragalactic log N - log S relationship
(e.g., Hasinger et al. 2001). The source spectrum assumed is a
power law characterised by $\Gamma = 1.7$ with a Galactic column
density $N_{\rm H} = 3 \times 10^{20}$ cm$^{-2}$.

[Figure 9. Top-panel x-axis: Flux (erg cm$^{-2}$ s$^{-1}$).]

Fig. 9. Top: Distribution of point source fluxes for the 2XMM
catalogue in the soft (red), hard (blue), and total band (green)
energy bands. The targets of the individual XMM-Newton
observations are excluded from these distributions (see Sect. C.1).
Detections selected for these distributions have likelihood L >
10 in the relevant bands. Only sources with summary flag are
included. Bottom: distribution of total EPIC counts for the same
sample of 2XMM detections. The red histogram shows the
distribution if the XMM-Newton targets are included.

The simulation creates images (and exposure maps etc.) in 
the five standard energy bands using the appropriate calibra- 
tion information (i.e., energy- and position-dependent PSFs, vi- 
gnetting, detection efficiency, etc.). The simulated data are then 
processed with exactly the same steps used in the actual 2XMM 
pipeline (Sect. 4) and the derived source parameters, such as
likelihoods, are compared with the input (i.e., simulated)
parameters.

Figure 10 shows the number of false detections per field
derived from the simulations as a function of the minimum L for
three different exposure times: 12 ks for MOS and 8 ks for pn,
corresponding to around 70% of the median exposure, and three
and ten times higher exposure values. Also shown is the
expected false detection number n for an assumed $N_c = 5000$
independent detection cells per field, calculated simply as
$n = N_c \exp(-L)$. The value of $N_c$ of course depends on the
effective 'beam-size' for EPIC observations. The value $N_c = 5000$
we adopt is based on the area of the search box (20" x 20",
Sect. 4.4.1), corrected downwards to take into account the
degradation and change of shape of the PSF off-axis. This value is a
factor ~ 4 times less than would be derived from assuming the
beam-size is of the order of the PSF width (e.g., 15" HEW),
highlighting that this is a poorly defined quantity.

Fig. 10. The number of false detections per field estimated via
simulations for typical high Galactic latitude fields as a function
of $L_{\rm min}$ (x-axis: minimum DETML) for various exposure times.
The red circles show the results for exposures of 12 ks for MOS and
8 ks for pn (~ 70% of the median values), whereas the green squares
and blue triangles show those with exposures 3 and 10 times higher
respectively. The dotted line represents the theoretical false
detection number assuming 5,000 independent detection cells per
field (see text).
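
The simple expectation for the number of false detections per field, n = N_c exp(-L), is easy to evaluate. The sketch below only restates that formula; the function name is hypothetical.

```python
import math

def expected_false_detections(likelihood, n_cells=5000):
    """Naive expectation n = N_c * exp(-L) for N_c independent detection cells."""
    return n_cells * math.exp(-likelihood)

for L in (6, 8, 10):
    print(L, round(expected_false_detections(L), 2))
```

For N_c = 5000 this gives roughly 12, 1.7 and 0.2 false detections per field at L = 6, 8 and 10, i.e. a drop of a factor e^2 (about 7.4) for every two units of L.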

The results shown in Fig. 10 demonstrate: (i) the number of
false detections per field is low even for L > 6 ; (ii) the depen- 
dence of the number of false detections on L is much flatter than 
simple expectations; (iii) the number of false detections depends 
on the exposure time. 

For typical observations included in the catalogue (represented
by the red curve in Figure 10), the number of false detections
is ~ [1, 0.3, 0.1] per field at likelihood limits of L >
[6, 8, 10] respectively. These values increase to ~ [4, 2, 1.5] for
the longest exposure time represented in Figure 10. For each of
the three exposure times adopted, we also compared the numbers
of false detections with the average number of sources detected
in corresponding exposures of typical XMM-Newton high
Galactic latitude fields, i.e. ~ [60, 100, 200] sources per field, to
derive false detection rates. We find that these rates have only a
low dependence on the exposure time, i.e. the false detection rate
is approximately constant at ~ [2, 1, 0.5]% for likelihood limits
L > [6, 8, 10] over the range of exposures investigated.

Our simulation results can be compared with the analysis 
presented by Brunner et al. (2008), carried out in the context of 
the very deep XMM-Newton observation of the Lockman Hole. 
Their simulations are for a detection approach similar to that 
presented here and their results are also broadly similar (cf. their 
Fig. 4 which shows a qualitatively similar dependence of false 
number with likelihood), albeit they are presented for different 
energy bands. The number of false detections in their simula- 
tions is higher, but of course corresponds to an observation with 
an exposure time ~ 100 times longer. Brunner et al. comment 
that the significant difference between the simulation results and 
simple expectations primarily originates in the multi-step detection
procedure (which introduces two effective detection thresholds)
and the simultaneous multi-band fitting of source positions
and fluxes, both of which result in a reduction of the effective
number of independent trials. The fact that the number of false 
detections depends on the exposure time is not in line with sim- 
ple expectations, but is probably a reflection of a combination 
of Eddington bias and source confusion effects. The much larger 
than expected false detection numbers at L > 12 may arise from 
a too stringent matching criterion between the input and output 
sources in the simulations. Other similar studies of the false de- 
tection rate in XMM-Newton observations include Loaring et 
al. (2005) for the relatively deep XMM-Newton 13$^H$ field and
Cappelluti et al. (2007) for the COSMOS field. Both studies de- 
termined false detection rates which are somewhat higher than 
our estimates for 2XMM, but these can be reconciled with de- 
tailed differences in the assumptions made in these studies. 

We also investigated the sensitivity of the false detection 
number to the background and to the assumed spectral shape. 
The largest differences are an increase by a factor ~ 2 at the low- 
est likelihoods (L < 8) for background conditions 3 times higher 
than typical. Assuming much softer or harder spectral shapes 
produces a similar increase in the false detection number, again 
restricted to the lowest likelihood bins. 

In addition to the false detection rate and calibration of 
the likelihood values, these simulations also provide a means 
to address the issue of catalogue completeness, i.e. the effects
of Poisson noise which produces a probability distribution for 
source detectability about the sensitivity limit. This study is be- 
yond the scope of the current paper, but we note that complete- 
ness corrections relating to these source detection biases are ex- 
pected to be small except at the lowest fluxes, cf. Georgakakis et 
al. (2008). 

The simulation work also allows us to address the astromet- 
ric performance of the processing. Comparison of the input and 
output positions shows that: (i) there is no measurable average 
offset; (ii) the distribution of position offsets closely follows the 
expected statistical form (cf. Sect. 9.5), validating the statistical
position error estimates. This distribution does, however, show 
offsets that are statistically too large for simulated sources with 
position errors < 0.5". The origin of this effect is unclear, al- 
though it may be related to the discrete sampling of the PSF 
representation in the XMM-Newton calibration data. 

Full details of the evaluation of the 2XMM catalogue with 
the simulations will be presented elsewhere (Sakano et al., in 
preparation). 

9.5. Astrometric properties 

In order to investigate the overall astrometric accuracy of the 
2XMM catalogue, in particular the extent to which the posi- 
tion error estimates correctly describe the true positional un- 
certainty, we tested the catalogue positions against the Sloan 
Digital Sky Survey (SDSS) DR5 Quasar Catalog (Schneider et 
al. 2007) which contains 77429 objects classified as quasars by 
their SDSS optical spectra. The sky density of the Sloan quasars 
is ~ 10 per square degree, and their positional accuracy is better
than 0.1", making this an excellent astrometric reference set.
This approach has the advantage that XMM-Newton is expected 
to detect a large fraction of all Sloan quasars in X-rays (espe- 
cially at the bright magnitude limit for SDSS spectroscopy) and 
thus, a priori, it seems safe to assume that essentially all posi- 
tional matches are actually real associations and that the SDSS 
provides the true celestial position of the object. 

To carry out the analysis, the 2XMM catalogue was cross- 
correlated with the DR5 Quasar Catalog, keeping all matches 
within 20". This produced around 1600 matches, corresponding
to 1121 unique 2XMM sources. The total sky area for the
matches (out to 20" radius) was ~ 0.2 deg$^2$. Given the sky
density of Sloan quasars this translates to ~ 2 false matches overall,
or ~ 0.5 false matches if we use just the inner 10" of the dis- 
tributions. We can thus be confident that the false match rate is 
negligible for this investigation. This is the real advantage of us- 
ing Sloan quasars over other comparison catalogues. 
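
The chance-match estimate is simple arithmetic, sketched below under the assumption that the search area is that of 1600 circles of the stated radius (the function name is hypothetical).

```python
import math

ARCSEC2_PER_DEG2 = 3600.0 ** 2

def expected_chance_matches(n_circles, radius_arcsec, density_per_deg2):
    """Expected random coincidences = total search area x reference sky density."""
    area_deg2 = n_circles * math.pi * radius_arcsec ** 2 / ARCSEC2_PER_DEG2
    return area_deg2 * density_per_deg2

print(expected_chance_matches(1600, 20.0, 10.0))  # full 20" search radius
print(expected_chance_matches(1600, 10.0, 10.0))  # inner 10" only
```

Under these assumptions the estimate is about 1.6 chance matches within 20" and about 0.4 within 10", the same order as the rounded values of ~ 2 and ~ 0.5 quoted above. Halving the radius reduces the number by a factor of four.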

For the astrometry evaluation a subset of these matches was 
used with detection likelihood L > 8, summary flag 0, off-axis 
angle < 13', and excluding extended sources. These selections 
reduce the total number of detection matches to 1007 (corre- 
sponding to 656 unique sources). 

Figure 11 shows the distribution of the X-ray/optical position
separations for each match for both the corrected and
uncorrected 2XMM coordinates. As can be seen, the uncorrected
separations peak at ~ 1.5" and show a broad distribution out to
4" - 5", whereas the corrected separations peak at < 1" and
show a narrower spread. This result of course reflects the overall
success of the astrometric rectification carried out as part of the
processing (Sect. 4.5).

To make a more detailed comparison of the observed and
expected distributions, we consider the separations normalised
by the position errors. If we define $x = \Delta r/\sigma_{\rm pos}$
where $\Delta r$ is the angular separation and $\sigma_{\rm pos}$ is
the total position error, the expected distribution function N(x)
takes the form

$N(x)\,dx \propto x\, e^{-x^2/2}\, dx$.

Thus comparing the empirical N(x) distribution with the expected
form provides a means to determine the correct $\sigma_{\rm pos}$
value. We expect $\sigma_{\rm pos}$ to have two components:
$\sigma_{\rm stat}$, the statistical error already determined in the
maximum likelihood fit (Sect. 4.4.3) and a possible additional,
residual component, $\sigma_{\rm sys}$ (see comment on nomenclature:
footnote 12), to take into account any residual errors in the
position determination and correction process, cf. Sect. 4.5.
Although it is not completely obvious how $\sigma_{\rm stat}$ and
$\sigma_{\rm sys}$ should be combined because the nature of the
residual error is formally not known, the analysis reported here
assumes that they add in quadrature,
$\sigma_{\rm pos}^2 = \sigma_{\rm stat}^2 + \sigma_{\rm sys}^2$
(cf. Sect. 8.1); other assumptions such as the linear combination
of the errors provide worse fits to the distributions.

Figure 11 (centre) shows the distribution, for corrected
XMM-Newton coordinates only, of the X-ray/optical position
separation sigmas (i.e., $x = \Delta r/\sigma_{\rm pos}$) for the
matched detection sample assuming $\sigma_{\rm sys} = 0$. Although
the observed distribution is reasonably close to the expected form
at low x-values, there is a long tail of outliers at x > 3.7
amounting to ~ 8% of the total sample, whereas we would expect
< 0.1% to lie at x > 3.7. More detailed investigation of these
outliers shows that they are dominated by sources with low
$\sigma_{\rm stat}$-values (mostly < 0.5"), clearly indicating the
need for an additional component, $\sigma_{\rm sys}$, of the order
0.5".
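
For Gaussian position errors in two dimensions the normalised separation follows a Rayleigh distribution, N(x) proportional to x exp(-x^2/2), whose tail fraction beyond x is exp(-x^2/2). The sketch below evaluates that tail and a quadrature-combined error; the function names and example values are illustrative assumptions.

```python
import math

def total_position_error(sigma_stat, sigma_sys):
    """Statistical and additional errors combined in quadrature (arcsec)."""
    return math.hypot(sigma_stat, sigma_sys)

def fraction_beyond(x):
    """Fraction of a Rayleigh-distributed sample with normalised separation > x."""
    return math.exp(-0.5 * x * x)

print(fraction_beyond(3.7))  # about 0.001, i.e. roughly 0.1%

# Illustrative values only: a 2" offset with sigma_stat = 0.3".
print(2.0 / total_position_error(0.3, 0.0))  # x with no additional error
print(2.0 / total_position_error(0.3, 0.5))  # x with a 0.5" additional error
```

An observed 8% of matches beyond x = 3.7 is therefore far in excess of the expectation, and adding a fixed extra error term mainly pulls in sources whose statistical errors are small.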

We investigated a range of possible values of $\sigma_{\rm sys}$ and
found that $\sigma_{\rm sys}$ = 0.35" provides the best overall fit
between the observed and expected distributions, as is shown in
Fig. 11 (bottom). For this choice of $\sigma_{\rm sys}$ there are
still more outliers at large x-values than expected if the position
errors were perfectly described, but we find that at least some of
these can be explained on astrophysical grounds (e.g., source
confusion, lensed objects), so regard our choice as the best
overall value to represent the global additional error estimate for
the catalogue.

A detailed comparison between the observed and expected
distributions (Fig. 11) shows that there is a deficit of points at
low x-values and indeed this is true for any $\sigma_{\rm sys}$ > 0.
This indicates that the true value of the statistical position
error, $\sigma_{\rm stat}$, is slightly overestimated by the fitting
routine. Attempts to model this effect with a simple rescaling of
the $\sigma_{\rm stat}$-value were not successful. We note that the
typical error estimate of the rectification of the XMM coordinates
is ~ 0.6" with a spread from 0.3" to > 1". This suggests that most
of the additional error component needed is related to the
rectification residuals, with other effects being at a lower level.
An obvious alternative approach is thus to use the explicit values
of the errors determined by the rectification algorithm for
$\sigma_{\rm sys}$ (which thus vary from field to field and indeed
from source to source if the error in the field rotation is taken
into account) instead of the empirically determined, fixed value
described above. Overall this approach gives similar results, but
gives x-values which are systematically significantly too low,
implying the uncertainties derived by the rectification algorithm
may also be significantly over-estimated (by up to 50%). We
conclude that using a fixed value of the additional error provides
the best empirical description of the data. On this basis the value
$\sigma_{\rm sys}$ = 0.35" was adopted for the 2XMM catalogue. The
total position error given in the catalogue, $\sigma_{\rm pos}$,
combines the statistical and additional errors in quadrature (see
the equation in Sect. 8.1). We note that the effect described here
may be identical to that discovered through the simulation work
described in Sect. 9.4. If this is the case it would imply that the
residual errors associated with the rectification must indeed be
rather lower than the formal estimated values overall.

Fig. 11. Top: X-ray/optical position separation for each match
for the corrected (solid histogram) and uncorrected (dashed
histogram) XMM-Newton coordinates. Centre: Distribution of
separation sigma (x) for $\sigma_{\rm sys}$ = 0. Bottom: Distribution
of separation sigma (x) for $\sigma_{\rm sys}$ = 0.35". For the centre
and bottom plots: histogram: separation sigmas; filled histogram:
outliers with x > 3.7 (and $\Delta r$ < 10"); smooth curve: expected
distribution N(x) normalised to fit the peak of the distribution.

We repeated the analysis described above for the uncorrected
XMM coordinates to determine the $\sigma_{\rm sys}$-value appropriate
to those XMM-Newton fields for which astrometric rectification
was not possible (see Sect. 4.5). For the uncorrected XMM-Newton
coordinates we determine a good fit between the observed and
expected distributions for $\sigma_{\rm sys}$ = 1.0". This value is
adopted in the catalogue for sources in those fields for which
astrometric rectification was not possible.

For completeness we looked for possible correlations be- 
tween outliers and the obvious XMM-Newton detection param- 
eters (e.g., detection likelihoods, off-axis angle). Rather surpris- 
ingly no clear correlations were found, except with off-axis an- 
gle where it was noted that detections at very high off-axis val- 
ues (> 15') were somewhat more likely to have statistically too 
large separations. By no means all high off-axis detections are 
affected in this way, however. Essentially this means that the sta- 
tistical position error estimates are robust over a very wide range 
of detection parameters and a single additional error component 
provides a very adequate representation of the data. Finally we 
note that properties of the 2XMM/Sloan DR5 Quasar sample are 
reasonably representative of the whole 2XMM catalogue. There 
is a bias towards higher X-ray fluxes and thus lower statistical 
position errors, but a significant number of lower flux objects 
are included and the full range of total counts and likelihoods is 
sampled. 

9.6. Photometric properties 

We have evaluated the flux cross-calibration of the XMM-Newton
EPIC cameras based on the calibration used to compute
2XMM fluxes (see Sect. 4.6). To do this we performed a
statistical analysis, comparing the fluxes between cameras for
sources common to both, selected from the entire FOV. The
parameter used to quantify the difference in flux was defined as
$(S_i - S_j)/S_j$, where $S_i$ and $S_j$ are the fluxes of the
sources in each pair of cameras (i, j).

To minimise the impact of other effects, we performed the 
following filtering on the comparison samples: 

1. We used only point sources, as the uncertainties in the
measured flux for extended sources are much higher.

2. We used only sources having at least 250 counts in the en- 
ergy band and for each camera. This requirement was used 
to avoid Eddington bias effects (an increase in the measured 
flux due to statistical fluctuations). 

3. We did not use sources with a 2-12 keV flux
$\gtrsim 6 \times 10^{-12}$ erg cm$^{-2}$ s$^{-1}$ as these objects suffer from pileup
and therefore their measured flux is underestimated.
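
The three filters and the flux-difference parameter can be sketched as follows. All dictionary keys are hypothetical names, and the sample mean is used only as a stand-in for a Gaussian fit to the distribution.

```python
from statistics import mean

PILEUP_FLUX_LIMIT = 6e-12  # erg cm^-2 s^-1 in the 2-12 keV band

def flux_differences(pairs, min_counts=250):
    """Return (S_i - S_j)/S_j for source pairs passing the three filters.

    pairs: list of dicts with (hypothetical) keys 'flux_i', 'flux_j',
    'counts_i', 'counts_j', 'flux_2_12' and 'extended'.
    """
    diffs = []
    for p in pairs:
        if p["extended"]:
            continue  # 1. point sources only
        if p["counts_i"] < min_counts or p["counts_j"] < min_counts:
            continue  # 2. enough counts in each camera (limits Eddington bias)
        if p["flux_2_12"] >= PILEUP_FLUX_LIMIT:
            continue  # 3. exclude sources bright enough to suffer pileup
        diffs.append((p["flux_i"] - p["flux_j"]) / p["flux_j"])
    return diffs

def mean_difference(pairs):
    """Mean fractional flux difference of the filtered sample."""
    return mean(flux_differences(pairs))
```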

The distributions obtained were fitted with a Gaussian pro- 
file, which in all cases provided a good representation of the 
data. The best-fit mean values obtained from each distribution 
are listed in Table 9.

There is an excellent agreement in the measured fluxes between
the two MOS cameras, better than 5% in all 2XMM energy
bands. The agreement between pn-MOS fluxes is also good,
better than 10% at energies below 4.5 keV and ~ 10 - 12%
above 4.5 keV. These flux differences are in broad agreement
with the results of Stuhlinger et al. (2008) who find a small
excess, 5 - 10%, of the MOS cameras with respect to pn, using a
sample of very bright on-axis sources. A more detailed analysis
will be presented in Mateos et al. (2008).

9.7. X-ray hardness (colour) distributions

For each 2XMM source there are four X-ray hardness ratios (X- 
ray 'colours') which provide a crude representation of the X-ray 
spectrum (cf. Sect. 4.4.3 for hardness ratio definition). In Fig. 12
we show the hardness ratio density plots for 2XMM catalogue 
sources at high and low Galactic latitudes. These plots are for the 
pn camera hardness ratios only, as they typically are better con- 
strained. Density plots are constructed for sources which have 
detection likelihood L > 8 in the energy bands comprising each 
pair of hardness ratios: this means that the subsample included 
in each plot differs and there is an inevitable bias towards softer 
sources for the HR1-HR2 plot and to harder sources for the
HR3-HR4 plot. Imposing the same likelihood threshold for all bands
would produce a bias towards higher flux sources and in fact 
would restrict this exercise to relatively small samples from the 
whole catalogue. We also exclude sources with summary flag 4; 
a more severe restriction on the flag produces relatively small 
changes to the overall distributions. Overlaid on these hardness 
ratio density plots are spectral tracks for representative simple 
power law and thermal spectral models with a range of absorb- 
ing column densities. 
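
As a reminder of how a hardness ratio is formed, the sketch below assumes the usual normalised-difference form, HR = (R_hard - R_soft) / (R_hard + R_soft), for the count rates in two adjacent energy bands. The definition actually adopted for the catalogue is the one given earlier in the paper; the function name and argument names here are hypothetical.

```python
def hardness_ratio(rate_soft, rate_hard):
    """Normalised difference of two band count rates; lies in [-1, 1]."""
    total = rate_soft + rate_hard
    if total <= 0:
        raise ValueError("total count rate must be positive")
    return (rate_hard - rate_soft) / total
```

With this form a source detected only in the softer band has HR = -1 and one detected only in the harder band has HR = +1, which is why increasing absorption moves a spectral track towards larger hardness values.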

These density plots provide an excellent statistical character- 
isation of the spectral properties of the catalogue sources, thus 
potentially providing constraints on the overall X-ray popula- 
tion. Although a detailed analysis is beyond the scope of the 
present paper, we comment here on how these match simple ex- 
pectations about the underlying source populations. 

For the high latitude regions of the sky, the density plot is 
dominated by sources with power-law spectra and column
densities $N_{\rm H} \lesssim 10^{22}$ cm$^{-2}$, as expected from the dominant population
of AGN. The fraction of AGN in 2XMM with $N_{\rm H} > 10^{22}$ cm$^{-2}$
can be seen from these plots to be quite low. The high latitude 
plots also show an extension to much softer hardness ratios. The 
main contributors to this are likely to be coronally active stars 
and non-active galaxies (see comment below about the thermal 
spectra). Due to the bias towards softer (harder) sources in the 
HR1-HR2 (HR3-HR4) plots noted above, the power-law tracks 
overlaid have different indices to approximately match the ob- 
served density distributions. 

At low Galactic latitudes, in contrast, the plots show a more 
complex structure (albeit the sample sizes are smaller). The 
overall low latitude density pattern is consistent with a large 
population of coronally active stars (particularly evident in the 
HR1-HR2 plot) with relatively soft thermal spectra together 
with a significant population of much more absorbed objects: 
background AGN together with distant accreting binaries in the 
Galactic plane (e.g., Hands et al. 2004). Sources with very low- 
temperature thermal spectra (i.e., kT ~ 0.3 keV) are only evident 
as a small component in the HR1-HR2 plot. We note that the 
density peak in the low latitude density plots is not consistent 
with what is expected for a distribution of single-temperature 
thermal spectra with a range of intrinsic temperatures. Instead 
the peak is much better matched by a multi-temperature
spectrum which we have here characterised empirically as a
composite three-component model with kT = [0.3, 1, 3] keV with
equal weighting (emission measure) of the three components. 
This finding is broadly consistent with the spectral properties of 
X-ray selected active star samples (e.g., Lopez-Santiago et al. 
2007 and references therein) in which such objects typically are 
best-fit with two-temperature models with kT ~ [0.3, 1] keV. 
Energy band    pn-M1          N_pn-M1   pn-M2          N_pn-M2   M2-M1         N_M2-M1
[keV]          [%]                      [%]                      [%]
(1)            (2)            (3)       (4)            (5)       (6)           (7)

0.2 - 0.5        4.9 ± 1.2     785        8.4 ± 0.9     771      -0.9 ± 0.6     987
0.5 - 1.0       -2.4 ± 0.3    1906       -2.7 ± 0.2    1957       1.0 ± 0.3    2384
1.0 - 2.0       -7.6 ± 0.3    2394       -8.6 ± 0.3    2461       0.6 ± 0.2    2932
2.0 - 4.5       -6.1 ± 0.3    1311       -5.4 ± 0.3    1342      -0.8 ± 0.2    1552
4.5 - 12.0     -12.4 ± 0.7     387       -9.5 ± 0.6     408      -3.2 ± 0.4     441

(1): Energy band definition in keV. (2) and (3): difference (in %) in the measured flux in pn and M1 and number of sources used in the comparison.
(4) and (5): same as Cols (2) and (3) but for pn and M2. (6) and (7): same as Cols (2) and (3) but for M2 and M1.
Fig. 12. Top row: EPIC pn X-ray hardness ratio density plots for high Galactic latitude ($|b| > 10°$) 2XMM sample. Bottom row:
X-ray hardness ratio density plots for low Galactic latitude ($|b| < 10°$) 2XMM sample. Density is displayed on a logarithmic scale
with a dynamic range of 100. The spectral tracks overlaid are for (i) power-law spectra with $\Gamma$ = 1.9, 1.7, 1.4 (blue) for the left,
middle, and right panels, respectively; (ii) thermal spectrum (APEC model) with kT = 0.3 keV (cyan; HR1-HR2 plot only); (iii)
a composite thermal model with three components with kT = [0.3, 1, 3] keV (green). In each case hardness values are shown for
$N_{\rm H} = [0.03, 0.4, 1, 5, 10, 50] \times 10^{22}$ cm$^{-2}$ (power-law model) and $N_{\rm H} = [0.01, 0.05, 0.1, 0.5, 1] \times 10^{22}$ cm$^{-2}$ (thermal models). For
each spectral track the left-most point marked corresponds to the lowest $N_{\rm H}$ value, i.e., $N_{\rm H}$ increases towards the top right.



The fact that our hardness density plots are better characterised 
with the ad hoc addition of a third higher temperature component 
clearly points to a harder component being present in a signifi- 
cant number of the objects contributing to the hardness density 
plots. 



9.8. Variability characterisation 

In the whole 2XMM catalogue there are 2307 detections
indicated as variable (cf. Sect. [8]), which relate to 2001 unique
sources. Evaluation of the frequency distributions of the $\chi^2$
probability, $P(\chi^2)$, from the time-series analysis reveals no
significant systematic effects and shows the expected behaviour
for the parts of the distributions dominated by random noise.
For example, the frequency distribution of $P(\chi^2)$, as shown in
Figs. 13b and c for the pn (the distributions for MOS1 and
MOS2 are very similar), is almost constant per unit interval of
probability down to low probabilities ($\lesssim 0.1$). Obviously, a
non-variable set of time-series would have this property across the
whole probability range 0.0 - 1.0.

Figure 13a shows the observed frequency distribution of
$P(\chi^2)_{\rm EPIC}$ compared with a simulated distribution for a
non-variable set of time-series. As there are many detections with
less than the full set of [pn, M1, M2] time-series, it was
necessary to reproduce this incompleteness in the simulation. The
numbers of detections with 3, 2, 1, or 0 $P(\chi^2)$-values are: 14917,
11330, 11917, 156, respectively. The simulation was conducted
by generating three vectors representing pn, M1, M2, with each
element containing a uniform, random number in the range
0.0 - 1.0. For each element, a check was performed to see
if there was a valid $P(\chi^2)$-value for the associated, real
camera data; if not, the random value was set to NULL (so that
the correct 'run' of valid values was mimicked in the
simulations). These values simulate the expected distribution of [pn,
M1, M2]-probabilities for the case of no real variability (see
Fig. 13a). As expected, the resulting distributions are 'flat' (on a
linear scale), as discussed above. A fourth vector was then
computed with the minimum simulated $P(\chi^2)$, i.e., a simulated set
of $P(\chi^2)_{\rm EPIC} = \min(P(\chi^2)_{\rm pn}, P(\chi^2)_{\rm M1}, P(\chi^2)_{\rm M2})$ over all available
values for each detection.
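
The simulation described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code used for the catalogue: the function name, the input layout (one tuple of three booleans per detection) and the use of `None` for NULL are all hypothetical choices.

```python
import random


def simulate_p_epic(valid, seed=0):
    """Simulate the minimum chi^2 probability for non-variable time-series.

    valid: list of (pn, M1, M2) booleans, True where the real detection has
           a chi^2 probability for that camera (hypothetical input layout).
    Returns one value per detection: the minimum simulated probability over
    the available cameras, or None if no camera has a value.
    """
    rng = random.Random(seed)
    result = []
    for row in valid:
        # One uniform random number in [0, 1) per camera: the noise-only case.
        draws = [rng.random() for _ in row]
        # Discard (set to NULL) draws where the real data have no value.
        kept = [p for p, ok in zip(draws, row) if ok]
        result.append(min(kept) if kept else None)
    return result
```

For a detection with k available cameras, the minimum of k independent uniform values has cumulative distribution 1 - (1 - p)^k, so the simulated minimum is skewed towards low probabilities even though each single-camera distribution is flat. This is why the mix of detections with 3, 2, or 1 values has to be reproduced in the simulation.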

Visual inspection of samples of time-series flagged as not
variable indicated a number of cases and types of variability
that were likely to have been 'missed' by the 2XMM variability 
test, implying that the catalogue is conservative in this respect. 
These included relatively short-duration increases or decreases, 
and low-level trends/ramps. 

We have compared the fraction of variable sources (or
detections) to all sources (or detections) having time-series as a
function of various other parameters of the catalogue. As a function
of flux (specifically EPIC total-band flux), we find this fraction
to be ~ 25%, 10%, and 5% for fluxes $> 10^{-10}$, $\sim 10^{-11}$, and
$\lesssim 10^{-12}$, respectively. This is broadly as expected as the ability
to detect variability falls towards lower fluxes.

We have also carried out an initial evaluation of the variable 
2XMM sources using secure positional matches with objects 
in the Simbad database. From this study we estimate that, for 
serendipitous (i.e., non-target) sources, ~40% are 'normal' (i.e., 
non-degenerate) stars, ~ 5% are X-ray binaries, ~ 3% are cata- 
clysmic variables and ~ 5% are AGNs, plus lower percentages 
of objects such as GRBs. Of order 45% could not be identified 
from Simbad. The above figures relate primarily to the ~ 1000 
sources with quality summary flag values 0-2. Although this is 
not a definitive study as the completeness of Simbad for different 
object types is highly non-uniform, it does nevertheless provide 
confirmation of the utility of the catalogue variability character- 
isation to select known types of variable objects efficiently. 

9.9. Extended sources 

The 2XMM catalogue contains more than 20 000 entries of
extended detections. The reliable detection and parameterisation
of extended sources is significantly more demanding than for
point-like sources because there are many more degrees of
freedom in the parameter space. The relatively simple analysis
approach used in the creation of the catalogue (Sect. 4.4.4) means
that the catalogue contains a significant number of extended
object detections that are either spurious or at least uncertain
(cf. Sects. 7.2 and 7.3). The most common causes of problems
with extended sources are summarised below and illustrated in
Fig. 14.

Spurious detections near bright point sources: These are 
mostly due to inaccuracies of the PSF models, leading to in- 
accurate modelling of the internal background by the source 
fitting routine. 

Confusion of point sources: Pairs or multiples of point sources 
can be detected as one extended source since only up to two 
point sources are modelled simultaneously by the fitting al- 
gorithm. 

Insufficient background subtraction: Some spatial variations 
of the intrinsic background are poorly modelled by the spline 
map. In regions where the background is underestimated, 
spurious detections of extended sources are possible. (In 
many cases the extent parameter of these sources is at or near 
the maximum of the allowed range, 80".) 

Multiple detections of extended sources: The surface
brightness distribution of extended sources is generally more
complex than the fitted $\beta$-model. This can lead to additional
detections in the wings of extended sources. The most extreme
cases are observations of complex, bright extended sources
(e.g., Galactic supernova remnants), leading to the detection
of numerous extended sources in one field. Also, extended
emission following the fitted $\beta$-model, but with an extent
greater than the maximum allowed in the fit, tends to be
broken up into multiple detections.

Instrumental artefacts: OOT events of piled-up sources,
single-reflection arcs, and scattered light from the RGA (cf.
Fig. 4c) can cause both point-like and extended spurious
detections.

The catalogue contains extensive detection flags (Sect. 7)
which can be used to produce much cleaner extended-source
samples, albeit at the expense of removing some genuine
extended objects. (This is the case as the flagging scheme only
provides warnings about generic problems with the analysis or
the data rather than a specific assessment of the reality of each
detection.) In particular the automated quality Flags 4, 5, and
6 (see Table 6) are set to warn about possible spurious
detections of extended sources. The combined Flag 7 for extended
sources is set if one of the Flags 4 - 6 is set. This flag is set for 
9 882 out of 20 837 detections, indicating a potential spurious 
fraction of about 50%. However, the rate of spurious detections 
is distributed very unevenly over the catalogue observations as 
is discussed below. 
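
The combination rule for Flag 7 and the quoted fraction can be checked with a few lines. This is only a sketch of the logic stated above; the function names are hypothetical and do not correspond to catalogue software.

```python
def flag7(flag4, flag5, flag6):
    """Combined extended-source warning: set if any of Flags 4-6 is set."""
    return bool(flag4 or flag5 or flag6)


def flagged_fraction(n_flagged, n_total):
    """Fraction of detections carrying the combined flag."""
    return n_flagged / n_total


# Numbers quoted in the text: 9 882 flagged out of 20 837 extended detections.
fraction = flagged_fraction(9882, 20837)
```

The resulting fraction is about 0.47, consistent with the "about 50%" quoted above. Because Flag 7 is a logical OR of generic warnings, it gives an upper bound on the number of suspect detections rather than a count of confirmed spurious ones.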

Figure 15 illustrates some of the main features of the
extended source detections in the catalogue. The plot shows that
there is, as expected, an overall correlation of extent likelihood 
with EPIC flux. The considerable scatter in the plot has three ori- 
gins: (i) the observations from which the detections are drawn 
have a considerable range of exposure times and background 
values; (ii) source extent: sources with larger spatial extent have 
lower likelihoods at the same integrated flux; (iii) the presence of 
significant numbers of spurious detections. The detections with 
Flag 7 set show, as expected, a broader distribution than those 
without this flag, and a much broader distribution than for the 
detections with 'best' summary flags (i.e., summary flag < 2). 
This is, of course, due to the fact that spurious detections will of- 
ten have implausible likelihoods for the fitted flux or correspond 
to very large source extent which is rare in genuine detections. 

Based on the sample with 'best' summary flags it is clear 
that there are very few reliable extended source detections 
with extent likelihood above ~ 1000 or flux above ~ 4 x 
Fig. 13. (a) Frequency distribution of $P(\chi^2)_{\rm EPIC}$, with log scales on both axes: solid line - observed; dotted line - simulation for
random noise, taking into account that there is not always a complete set of 3 camera values for each detection. (b) Frequency
distribution of $P(\chi^2)_{\rm pn}$, with log scales on both axes: solid line - observed; dotted line - simulation for random noise. (c) As (b) but
with linear scales on both axes.




Fig. 14. Examples of extended source detections. Green circles mark point source detections. In panels (i) - (vi) the magenta and
yellow circles mark real and spurious extended detections respectively, plotted with their fitted extent (i.e., core radius, see Sect. 4.4.4).
In panels (vii) & (viii) the yellow ellipses indicate the position of spurious extended source detections.

Top row: (i) a compact extended source with a small core radius; (ii) a large, low surface brightness extended source at the edge of
the FOV with low likelihood but high flux (see Fig. 15); (iii) an object with a point-like core detected both as a point source and
as an extended source; (iv) a clearly extended source with a spurious detection nearby (yellow circle) which is smaller and fainter
(by a factor of 45) and which therefore does not significantly affect the parameters of the real source.

Bottom row: (v) a SNR in the LMC where intrinsic structure is detected as point sources (note that the core radius is not
representative as the extended emission does not follow the $\beta$-model fitted); (vi) a bright extended source with multiple spurious detections
around the centre: the core radii of these spurious detections are comparable to or greater than the extent of the real source and will 
thus significantly affect the parameters of the real source (note that the maximum core radius allowed in the fitting is 80"); (vii) a 
faint filamentary structure broken up into several extended detections where the parameters have little meaning (due to the circularly 
symmetric nature of the fit); (viii) a crowded region where several point sources are detected as extended due to source confusion 
(the algorithm is restricted to fitting at most two confused sources simultaneously). 



$10^{-13}$ erg cm$^{-2}$ s$^{-1}$, highlighting the problems that the detection
algorithm has with bright objects.$^{23}$ Indeed the majority of reli-



23 We also note that this is what is expected from the source counts 
of clusters of galaxies which are expected to dominate the extended 
detections, at least at high Galactic latitudes. 



able extended objects in this region of the diagram are the XMM- 
Newton targets themselves (but note that many of these have 
Flag 7 set which would otherwise indicate potentially spurious 
detections). At the highest fluxes a large fraction of the detec- 
tions relate to very bright point-like targets that are incorrectly 
Fig. 15. Distribution of extent likelihood as a function of
total-band EPIC flux for the extended source detections in the 2XMM
catalogue. Red dots are potentially spurious detections with Flag
7=T, yellow dots are detections with Flag 7=F, black dots are
the 'best' sample detections with summary flag < 2. Green stars
mark the targets of the XMM-Newton observations classified as
extended object types and blue squares targets which are object
types classified as point-like. The vertical concentrations of
target points at flux $\sim 3 \times 10^{-11}$ and $\sim 2 \times 10^{-10}$ erg cm$^{-2}$ s$^{-1}$ are real,
being due to multiple detections of two different SNRs used as
XMM-Newton calibration targets.



parameterised as being extended due to the deficiencies of the
fitting algorithm noted above.

We have investigated a small subset of the extended detec- 
tions at high Galactic latitudes covered by SDSS DR6 (excluding 
targets). We selected detections with extent likelihood > 100 and 
no warning flags set (i.e., summary flag 0) and evaluated their 
validity by examining the X-ray images visually and by search- 
ing for matches with catalogued objects. We find that less than 
5% of these may be spurious extended source detections, around 
40% are clearly associated with catalogued clusters or groups 
of galaxies and a few percent are associated with single nearby 
galaxies. For a further ~ 30% of the detections we find convinc- 
ing evidence of a previously uncatalogued cluster or group of 
galaxies at the X-ray source location from visual inspection of 
the SDSS DR6 images. These results demonstrate that the over- 
all reliability of the 'best' extended source sample is high, at 
least at higher likelihoods, and that, as expected, the extended 
source sample is dominated by groups and clusters of galaxies. 
We have not carried out a similar exercise systematically at low 
Galactic latitudes but checks of selected detections demonstrate 
the expected associations with SNRs, HII regions, and discrete 
extended features in the Galactic Centre region. 

10. Availability of the catalogue and catalogue 
products 

The 2XMM catalogue table itself is essentially a flat
file with 246 897 rows and 297 columns (described in
Appendix D). Access to the catalogue file in various
formats (FITS and comma-separated-variable [CSV])
is available from the XMM-SSC catalogues web-page
(http://xmmssc-www.star.le.ac.uk/Catalogue/). This XMM-SSC
web-page is the primary location for information about the 
2XMM catalogue. It provides links to the other hosting sites 
and the documentation for the catalogue. It also provides a 
'slimline', reduced volume version of the 2XMM catalogue, 
which is based on the 191 870 unique sources and contains just 
39 columns. The columns in this version are restricted to just 
the merged source quantities, together with the 1XMM and 
2XMMp cross-correlation counterparts. 

Ancillary tables to the catalogue also available from the
XMM-SSC web-page include the table of observations
incorporated in the catalogue (Appendix B) and the target identification
and classification table (Appendix C).

Associated with the 2XMM catalogue itself is an extensive
range of data products such as the EPIC images from each
observation and the spectra and time-series data described in Sect. 5.
These products are accessible, along with the catalogue itself,
from ESA's XMM Science Archive (XSA)$^{24}$, the LEDAS$^{25}$
(LEicester Database and Archive Service) system and are
being made available through the Virtual Observatory via LEDAS
using AstroGrid$^{26}$ infrastructure.

LEDAS also provides access to a single HTML summary 
page for each detected source in the catalogue. These summary 
pages provide the key detection parameters and parameters of 
the corresponding unique source, links to other detections of the 
same source, thumbnail X-ray images and graphical summaries 
of the X-ray time-series and spectral data where these exist. 

The results of the external catalogue cross-correlation
carried out for the 2XMM catalogue (Sect. 6) are available as
data products within the XSA and LEDAS or through a
dedicated on-line database system hosted by the Observatoire de
Astrophysique, Strasbourg.$^{27}$

11. Summary 

We have presented the 2XMM catalogue, described how the
catalogue was produced and discussed the main characteristics of
the catalogue. Table 10 provides a summary of its main
properties, bringing together information presented elsewhere in this
paper.

2XMM is the largest X-ray source catalogue ever produced, 
containing almost twice as many discrete sources as either the 
ROSAT survey or ROSAT pointed catalogues. The catalogue 
complements deeper Chandra and XMM-Newton small area sur- 
veys, and probes a large sky area at the flux limit where the bulk 
of the objects that contribute to the X-ray background lie. The 
catalogue has very considerable potential, a detailed account of
which lies outside the scope of this paper. In particular, the
catalogue provides a rich resource for generating sizeable,
well-defined samples for specific studies, utilising the fact that
X-ray selection is a highly efficient (arguably the most efficient)
way of selecting certain types of object, notably active galaxies
(AGN), clusters of galaxies, interacting compact binaries and ac- 
tive stellar coronae. The large sky area covered by the serendip- 
itous survey, or equivalently the large size of the catalogue, also 
means that 2XMM is a major resource for exploring the vari- 
ety of the X-ray source population and identifying rare source 



24 http://xmm.esac.esa.int/xsa/ 

25 http://www.ledas.ac.uk/xmm/2xmmlink.html

26 http://www.astrogrid.org 

27 http://amwdb.u-strasbg.fr/2xmm/home 
Table 10. Summary of 2XMM catalogue characteristics

Energy range (keV; set by EPIC cameras)                          0.2 - 12.0

Observations                               total                 3491
                                           pn data               2674
                                           MOS1 data             3384
                                           MOS2 data             3394

Time interval                                                    Feb 2000 - Mar 2007

Detections                                 total                 246897
                                           total L > 10          201275
                                           total sum flag        199359
                                           point-like            226060
                                           extended              20837
                                           with products         38320

Unique sources                             total                 191870
                                           point-like            173066
                                           extended              18804

Sky area (deg$^2$)                         total (a)             ~ 560
                                           net MOS1/2            ~ 355
                                           net pn                ~ 330

Median exposure time                       MOS1/2                ~ 16000 s
(per observation)                          pn                    ~ 12500 s

Flux limit (pn) at ~ 10% sky               0.5 - 2.0 keV         ~ 2
coverage ($10^{-15}$ erg cm$^{-2}$ s$^{-1}$)       2.0 - 12 keV          ~ 15
                                           4.5 - 12 keV          ~ 35

Flux limit (pn) at ~ 90% sky               0.5 - 2.0 keV         ~ 10
coverage ($10^{-15}$ erg cm$^{-2}$ s$^{-1}$)       2.0 - 12 keV          ~ 90
                                           4.5 - 12 keV          ~ 250

Astrometric accuracy (1$\sigma$)           typical               1.5"
                                           best (b)              0.35"

Photometric accuracy                       MOS1/2 comparison     < 5%
                                           pn/MOS comparison     < 10%

(a) overlaps included
(b) limited by systematics



types. Although the 2XMM catalogue alone provides a power- 
ful way of studying the X-ray source population, matching the 
X-ray data with, e.g., optical catalogues can offer an even more 
effective way to generate considerable samples of particular ob- 
ject types. Projects that exploit some of these characteristics are 
already underway. 

Finally we note that, since the XMM-Newton spacecraft and 
instruments remain in good operational health, we can anticipate 
a substantial growth in the pool of serendipitous X-ray sources 
detected, increasing at a rate of ~35 000 sources/year. With this 
backdrop, further XMM-Newton catalogue releases are planned 
at regular intervals. The first such incremental release is planned 
for August 2008. 
Appendix A: Summary of XMM-Newton and EPIC
camera terminology

On-axis position : the telescope optical axes, defined by the
geometry of each of the three X-ray mirror modules, are not
coincident with the geometrical centres of the EPIC detectors. The
target of the observation is preferentially placed close to, but
offset from, the optical axis.

Point-Spread-Function (PSF) : the telescope optics spread X- 
ray photons from a point source into a centrally-peaked dis- 
tribution which is oversampled by the EPIC cameras. The 
PSF is energy-dependent and becomes broader with increas- 
ing (off-axis) angle from the telescope optical axis but also 
suffers a distortion which elongates the profile in the az- 
imuthal direction. 

Event patterns : an X-ray photon incident in a given CCD lo- 
cation causes charge deposition in several surrounding CCD 
pixels, often not symmetrically distributed around the cen- 
tral pixel. Several distinct charge distributions (patterns) are 
recognised as real events by the on-board processing elec- 
tronics for the MOS cameras, whilst this processing takes 
place on the ground for the pn camera. 

Out-Of-Time (OOT) events : EPIC camera exposures are 
composed of many short-duration frames during which the 
recorded events are rapidly read out and processed by the 
on-board electronics. The total time between frames (frame- 
time) depends on the observing mode but is a maximum of 
73 ms for the pn and 2.6 s for the MOS. The cameras are 
shutterless and record data during the readout ('out-of-time') 
as well as the processing ('imaging') period, leading to a 
faint trail of the 'out-of-time' events along the readout
direction which becomes obvious for bright sources (see Fig. 4).
The percentage of OOT events is a function of the ratio of the 
frame readout time to the frame integration time for a given 
mode. The highest percentage of OOT events at 6.3% is for 
the pn full frame mode, while it never exceeds 0.5% for the 
MOS. 

Pileup : for bright sources, pixels in the core of the PSF can 
receive multiple X-ray photons during an integration frame. 
The on-board processing electronics cannot recognise them 
as distinct events within that frame and either treats them as a 
single event with higher energy or rejects them entirely if the 
resultant pixel pattern of the combined event lies outside the 
pre-defined X-ray pattern library. As a result the recorded 
counts are lower in the core of the source profile, produc- 
ing a flattening or even depression of the source profile (see 
Fig. St). In addition, it has an impact on the spectral profile 
(i.e., a hardening of the spectrum). 

Optical loading : The EPIC cameras (more so the MOS detec- 
tors) are also sensitive to optical photons so that optically 
bright objects generate recordable events in EPIC images 
(Lumb 2000, and references therein). The level of contam- 
ination depends on the filters used and the optical brightness 
of the object. In most observations the filter used is conser- 
vatively selected to minimise this effect. Note that in the case 
of the pn some apparently very soft sources are affected by 
optical loading. 

RGA scattered light : scattering of incident X-rays by the 
RGAs in the two telescope modules that feed the MOS cam- 
eras causes a diffuse bright narrow band in the X-ray images 
which is detectable for bright X-ray sources (see Fig. 4).

Good-Time-Interval (GTI) : data from EPIC camera frames 
can be accepted or rejected according to the state of various 
housekeeping and science parameters, e.g., spacecraft
attitude stability and particle background level. The 'GTI's are
the time periods during which the parameter(s) being
monitored are within the acceptable thresholds.

A more detailed description of the instruments can be found
in the on-line version of the XMM User Handbook (Ehle et al.
2007) and on the ESAC documentation web-pages for
calibration.$^{28}$



Appendix B: Observation summary table 

Table ?? presents the observations and exposures included
in 2XMM and is available on-line at A&A as well as at the
XMM-SSC catalogue web-page (cf. Sect. 10). The columns in
this table are as follows.

Column 1: satellite revolution number (consecutive in time). 

Column 2: observation number (10 digit ID). 

Column 3: ODF version number. 

Column 4 and 5: nominal field Right Ascension and decli- 
nation (J2000) in degrees. 

Column 6: target name (20 characters). 

Column 7: Quality classification of the whole observation
based on the area flagged as bad in the manual flagging process
as compared to the whole detection area (see Sect. 7.4): 0 means
nothing has been flagged; 1 indicates that 0% < area < 0.1% of
the total detection mask has been flagged; 2 indicates that 0.1%
< area < 1% has been flagged; 3 indicates that 1% < area < 10%
has been flagged; 4 indicates that 10% < area < 100% has been
flagged; and 5 means that the whole field was flagged as bad.
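
The Column 7 classification can be expressed as a small lookup. The sketch below is illustrative only: the function name is hypothetical, the input is assumed to be the flagged fraction of the detection mask (0 to 1), and the assignment of values lying exactly on a class boundary is an assumption, since the text does not say which class they fall into.

```python
def observation_class(flagged_fraction):
    """Map the flagged fraction of the detection mask (0..1) to class 0-5."""
    if not 0.0 <= flagged_fraction <= 1.0:
        raise ValueError("flagged fraction must lie in [0, 1]")
    if flagged_fraction == 0.0:
        return 0  # nothing has been flagged
    if flagged_fraction == 1.0:
        return 5  # the whole field was flagged as bad
    # Upper edges of classes 1-3: 0.1%, 1%, 10% (boundary handling assumed).
    for cls, upper in enumerate((0.001, 0.01, 0.1), start=1):
        if flagged_fraction < upper:
            return cls
    return 4  # between 10% and 100% flagged
```

The classes are spaced by factors of ten in flagged area, so the class number is effectively a coarse logarithmic measure of how much of the field is affected.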

Column 8: number of detections in this field. 

Column 9: number of detections in this field that have not
received manual Flag 11 and are considered to be 'good'.

Column 10: number of the pn exposures merged for the
source detection (cf. Sect. 4.1).

Column 11: filter of the pn exposures: Tn1 stands for Thin1,
Tn2 for Thin2, Med for Medium, and Tck for Thick.

Column 12: observing mode (cf. Table 1) of the pn
exposures.

Column 13: total exposure time of the pn exposures in sec- 
onds. 

Column 14-17: same as columns 10-13 but for MOS1. 

Column 18-21: same as columns 10-13 but for MOS2. 



Appendix C: Target identification and classification 
procedures 

In the following are described the procedures adopted to iden- 
tify and classify the targets of each XMM-Newton observation 
included in the 2XMM catalogue. The results of this exercise are 
available on-line at A&A. 

As any attempt to identify and classify a target is subjective 
and likely to be incomplete (only the investigators of that ob- 
servation know all the details), two different approaches were 
chosen to give the user a choice regarding detail and reliability: 
on the one hand some formal information associated with an ob- 
servation is provided; on the other hand, a manual classification 
scheme tries to supply interpretation of sometimes ambiguous 
target names and to directly identify associated 2XMM detec- 
tions. 



C.1. Formal target identification 

There are three kinds of coordinates associated with each obser- 
vation: 

1. The median of the spacecraft attitude ('pointing direction', 
independent of the instrument) usually points to approxi- 
mately the same position on the detectors and defines best 
the centre of the FOV (this is given in Table ??). 

2. The proposal position refers to the position given by the ob- 
server; this position is placed at a specified detector location 
which depends on the prime instrument (EPIC or RGS) as 
indicated by the observer and which avoids chip gaps, dead 
spots etc, unless an offset is indicated by the investigator. 

3. The XSA gives the coordinates of the prime instrument 
viewing direction which are corrected for the star tracker 
mis-alignment. 

In most cases, the proposal position is the best representa- 
tive of the target object as chosen by the investigators. However, 
there are cases where the actual target object is deliberately off- 
set from the proposal position, or the proposal position is not 
very accurate. The latter can be due to catalogue errors, posi- 
tions with large uncertainties (e.g., gamma ray sources), or an 
error by the observer. In cases where more than one object is the 
target the proposal position can either be located on one of the 
objects or between them. In a few cases, the image was not ob- 
tained at the proposed position due to a slew failure or a 'Target 
of Opportunity' (ToO) observation that was not properly regis- 
tered in the ODF. 

The XSA coordinates are usually near the centre of the field 
and/or the target but do not represent the target position as well 
as the proposal position. 

The target identification table (Appendix C.5) lists the pro-
posal and XSA positions together with the proposal category and
proposal program information as given in the XSA. The latter 
provide a coarse classification of the target as determined by the 
observer. Note, though, that the proposal category of a calibration
observation is often meaningless, since such observations are often
instrument related and no proposal category exists for them.

C.2. Manual target identification 

In many ways the target name as given in the proposal gives a 
better indication of the field content than the coordinates since 
a target can comprise more than one object or it may be diffuse 
emission that can only be detected in the spectra of background 
objects. In other words, if a target name can be resolved by on- 
line data bases like Simbad and NED one can easily derive more 
information about that object, e.g., object type, other names, or 
references. 

On the other hand, an XMM-Newton target name can be de- 
scriptive or refer to a personal choice of the observer, it can be 
abbreviated, or additional information is added. It was therefore 
necessary to 'interpret' many of the target names before Simbad 
could recognise them. 

The target identification table lists therefore, next to the 
XMM-Newton target name, the best estimate of the Simbad- 
recognisable name where possible (usually very close to the 
given target name), together with the Simbad coordinates and
Simbad object type for classification purposes. In cases where 
Simbad gives more than one object type, the one closest to the 



28 http://xmm2.esac.esa.int/external/xmm_sw_cal/calib/documentation/index.shtml



29 Note that Simbad frequently updates its information and the coor-
dinates given here may be out of date.

proposal category was given. Where no Simbad name could be
identified a NED identification may be given instead, and where 
possible an estimated object type based on the proposal informa- 
tion is given. 

For the use of the catalogue, however, it is most helpful to 
know which and how many sources are 'targets' and therefore 
not serendipitous. The observations are thus classified by their 
field content (i.e., target classification; see Fig. 4 for some ex-
amples), using the following categorisation:

- a point or point-like source, that is, a single detection in the 
catalogue (excluding spurious detections); 

- an extended source (the target can be the detection of the 
extended emission as well as point sources associated with 
it, e.g., galaxies in a cluster); 

- a field, that is, all detections are potential targets (e.g., distant 
AGNs); 

- diffuse emission; the detections in such a field are considered 
to be all serendipitous but the location of the field was chosen 
specifically by the observer because of the presence of the - 
often large-scale - diffuse emission; 

or a combination of these. Occasionally the field is totally 
serendipitous due to operational issues. For fields that could not 
be easily classified, the content is 'unknown'. The class of ex- 
tended targets was further divided as follows: 

- small extended source (i.e., well within the FOV) with a ra- 
dius of < 3' (covering roughly 3% of the full FOV), 

- large extended source with a radius of > 3' and often extend- 
ing beyond the FOV, 

- extended source of undetermined radius: these are either not 
detected, not identifiable (more than one object fitted the de- 
scription), or offset and beyond the edge of the FOV. 

In cases where one or two point sources are the target, the 
catalogue detection IDs (for a match within ~ 10") are given as 
well. In cases of extended targets a catalogue detection ID is only 
given if the match is unambiguous and the centre of the extended 
emission well represented by the XMM-Newton detection (the 
parameters of the detection, however, may be unreliable). In a 
few cases a positive identification could be achieved through an- 
other but deeper observation of the same target. 
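
The ~10" matching radius quoted above for point-source targets can be illustrated with a simple angular-separation test. This is only a sketch: the function names, the use of decimal degrees, and the idea of comparing one target position against a list of detection positions are assumptions made for illustration, not the procedure actually used to build the table.

```python
import math

def angular_separation_arcsec(ra1, dec1, ra2, dec2):
    """Great-circle separation in arcsec between two positions given in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    # Haversine formula: numerically stable for the small angles of interest here.
    sin_ddec = math.sin((dec2 - dec1) / 2.0)
    sin_dra = math.sin((ra2 - ra1) / 2.0)
    h = sin_ddec ** 2 + math.cos(dec1) * math.cos(dec2) * sin_dra ** 2
    return math.degrees(2.0 * math.asin(min(1.0, math.sqrt(h)))) * 3600.0

def detections_near_target(target, detections, radius_arcsec=10.0):
    """Return IDs of detections within radius_arcsec of the target position.

    target: (ra, dec) in degrees; detections: iterable of (detid, ra, dec).
    The 10" default mirrors the approximate radius quoted in the text.
    """
    t_ra, t_dec = target
    return [detid for detid, ra, dec in detections
            if angular_separation_arcsec(t_ra, t_dec, ra, dec) <= radius_arcsec]
```

A match found this way would still need the manual checks described above, in particular for extended targets.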

Because neither the formal nor the manual classification can 
be perfect in every case, the table also lists, for quick refer-
ence, an indicator of which position (proposal or Simbad)
best represents the target (subject to changes and improvements
in Simbad). In some cases both positions were deemed to be 
equally viable (e.g., in field observations or large offsets of ex- 
tended objects) and no preference is given in the table. 

C.3. Problem cases 

Not all targets fit unambiguously into the field content classes. In 
a few cases where no decision could be made the target was clas- 
sified as 'unknown'. Otherwise the following guidelines were 
used. 

Galaxies: A galaxy was classified as 'point source' when the 
emission from the (active) nucleus was dominant. It was 
classified as 'extended' when either diffuse emission was ap- 
parent or if the galaxy was large enough for discrete X-ray 
sources in the galaxy to be resolved (in case of doubt a com- 
parison was made with an optical image downloaded from 



the DSS) or if the galaxy was detected as a single point
source in the catalogue but it clearly consisted of several (un- 
resolved) sources. 

In two cases, a 'field' classification was preferred: observa- 
tions of the M31 halo and offset pointings of M33. In both 
cases the galaxy is considerably larger than the FOV. Note 
that the observations of the centre of M31 (often called M31
core) are classified as 'large extended' instead since the field
includes diffuse emission. 

Galaxy clusters: Galaxy clusters usually show X-ray emission 
from the intracluster gas as well as emission from some of 
the galaxies within that cluster. Most galaxy clusters were 
classified as 'large'. Exceptions are distant clusters which are
significantly smaller than r = 3' and where no point sources
could be discerned within the diffuse emission. 

Galaxy groups: Galaxy groups have fewer members than 
galaxy clusters. In many cases there is no detectable intra- 
cluster emission and the X-ray images show only emission 
from some of the members. In some cases there is a promi- 
nent galaxy in the centre with a large X-ray halo. Despite this 
diversity it was preferred to classify all groups in the same 
way as galaxy clusters, that is, as extended emission, mixed 
with point or other extended sources. 

Extragalactic point sources: In a few cases a bright X-ray 
source within a galaxy was the target (e.g., 'super Eddington' 
sources); these were treated like AGNs, that is, if no galaxy 
emission could be discerned the target was classified as 
'point source', otherwise as 'extended'. 

Mixed targets: Examples for mixed targets are a particular 
galaxy within a galaxy cluster or a Central Compact Object 
in a SNR. These were classified by the 'larger' target, that is, 
in the examples given the class would be 'extended', while 
the Simbad object type is likely to refer to the point(-like) 
source. There are a number of cases where such a connec- 
tion was not obvious or could not be easily determined (e.g., 
a connection between a quasar and a galaxy cluster which 
may be hosting the quasar or simply be superimposed in the 
line-of-sight) and the class refers to the quoted object only. 
In case of a calibration observation the object is more likely 
to be chosen for its own properties and not for its possible 
connection/interaction with the environment. 

Solar system objects: There are a number of observations of 
planets or comets in our solar system. A special object type, 
'com' for 'comet' and 'plt' for 'planet', is listed for these.
The field classification depended on what was visible in 
the image, e.g., if there was visible (and detected) diffuse 
emission in case of a comet, or if a planet was observed 
long enough to produce an elongated trace on the image (the
pipeline processing corrects for any attitude shift so that a 
fixed point in the sky is always at the same location in the 
image). 

C.4. Target classification 

There are 3491 fields in total in the 2XMM catalogue. For 3044 
fields (87%) a Simbad name could be found, and in 53 cases 
(1.5%) a NED identification is given. Of the remaining 394 fields 
only 56 (1.6%) do not have an estimated object type. 

About 10% of the observations were obtained for calibration 
purposes, and 3% are ToO observations. Table C.1 lists the dis-
tribution of the proposal category for 2XMM observations, and
Table C.2 gives the same for the field content classes. The ratio
of point source to extended source to field observation is roughly
5:3:1.

Table C.1. Proposal category given by the XSA

Class  Description                                      Percentage

I      Stars, White Dwarfs and Solar System             16%
II     White Dwarf Binaries, Neutron Star Binaries,     15%
       Cataclysmic Variables, ULXs and Black Holes
III    Supernovae, Supernova Remnants, Diffuse          14%
       Emission, and Isolated Neutron Stars
IV     Galaxies and Galactic Surveys                     9%
V      Groups of Galaxies, Clusters of Galaxies,        14%
       and Superclusters
VI     Active Galactic Nuclei, Quasars, BL Lac          23%
       Objects, and X-ray Background
VII    X-ray Background and Surveys                      8%

For best results in identifying target objects in the catalogue,
it is recommended to use both the field content class and
the Simbad object type.

Table C.2. Target/field content classification

Class  Description                                      Percentage

p      point or point-like source                       50%
s      small extended (r < 3')                          10%
l      large extended (r > 3')                          22%
e      extended source of unknown extent                0.7%
f      'field' (all detections are potential targets)   12%
x      'X-ray shadow experiment' and similar, that is,  2.5%
       only the spectra of fore- and background
       objects are of interest (though the location
       of the field should be considered as 'target')
t      two clearly identified targets (e.g., a double   0.4%
       star)
n      there is no target associated with the field     0.2%
u      unknown target, i.e., the target could not be    2%
       classified or is of unknown nature



C.5. Target table 

The columns in Table ??, which is available on-line at A&A as
well as at the XMM-SSC catalogue web-page (cf. Sect. 10), are
as follows. 

Column 1: satellite revolution number (consecutive in time). 

Column 2: observation number (10 digit ID). 

Column 3: a star indicates if there is a note for this obser- 
vation or for this proposal-ID (first 6 digits of an observation, 
referring to several observations for this proposal) as detailed 
below. 

Column 4: the source number per observation of the identi-
fied target taken from the column SRC_NUM in the catalogue.

Column 5: the detection ID of the identified target taken from 
the column DETID in the catalogue. 

Column 6: field classification as described in Table C.2.

Column 7: coordinate preference between proposal position
and Simbad position, depending on which defined the target bet-
ter; in case of offset positions (usually indicated in the field name
from the proposal, Col. 12) no preference is given.

Column 8: proposal category as taken from the XSA as de-
scribed in Table C.1 (note that some of the calibration observa-
tions are not properly classified).

Column 9: proposal program as taken from the XSA: GO 
stands for Guest Observer, Cal for Calibration, ToO for Targets 
of Opportunity, Cha for Co-Chandra, ESO for Co-ESO, Trig for 
Triggered, and Large. 

Columns 10 and 11: Right Ascension and declination 
(J2000) in degrees as given in the proposal (taken from the 
RA_OBJ and DEC_OBJ keywords in the attitude time-series 
file). 

Column 12: field name as given in the proposal (taken from 
the OBJECT keyword in the calibration index file). 

Columns 13 and 14: Right Ascension and declination (J2000)
in degrees as extracted from Simbad using the Simbad name 
given in Col. 16. 

Column 15: object type as given by Simbad. If no Simbad 
object is given a type was estimated. Additional types not recog- 
nised by Simbad are: XRN for X-ray reflection nebula, sfr for 
star forming region, plt for planet, and com for comet.

Column 16: modified field name which Simbad recognises 
(and can be used in a script), except for 53 cases that have 
a name recognised by NED (indicated with '[ned]' after the 
name). Modifications include dropping offset indicators, com- 
pleting coordinates, and adjusting the prefix to a recognised con- 
vention as described in Simbad's dictionary of nomenclature. 

Columns 17 and 18: Right Ascension and declination (J2000)
in degrees as given in the XSA; they represent the prime instru- 
ment viewing direction (median value) and are corrected for the 
star tracker mis-alignment. 

A list of observations (10 digits) or proposal-IDs (6 digits) 
in numerical order with special remarks as indicated in Col. 3 of 
the table follows. 

0002740101: CFHT-P1-12 appears to be the name of a CFHT 
plate, and the proposal abstract suggests that this is a field 
observation. 

0002970401: The coordinates of the proposal position and im- 
age do not agree. The Observation Log Browser web-page at 
ESAC refers to an 'earth limb test'. The field of the observa- 
tion is therefore as a whole serendipitous. 

0008820401: The observation of HD 168112 was replaced by a
ToO observation of GRB 020321 which, however, was not
registered in the ODF.

004534: This is a double star but the X-ray detection is not at the 
Simbad position, and the field classification is ambiguous. 

0075940101: Simbad recognises the field name '30 Ari' but re- 
turns two objects (30 Ari A and 30 Ari B). Due to the ambi- 
guity no Simbad name and coordinates are given. 

0093550401: This observation was intended to have Z And as a 
target but due to an operational issue a different position was 
observed. The field of the observation is therefore as a whole 
serendipitous. 

0094360201: There seems to be an error in the proposal coordi-
nates; the field of the observation is therefore
as a whole serendipitous.

0094380101: The observation of 1ES 1255+244 was replaced 
by a ToO observation of GRB 011211 which, however, was 
not registered in the ODF.

0094530401: The observation has a large offset
from 3C 192.

0106860101: There is a source at the proposal position; how-
ever, it is possibly only a spurious extended detection, and
therefore no source ID is given. 

010806: The field name is AXAF Ultra Deep Field; this appears 
to be the same as the Hubble Ultra Deep Field with very 
similar coordinates (53.1625, -27.7914). 

0109060201: Ambiguous because target name is not precise 
enough. 

010986: The target name is A 189 but the proposal abstract in- 
dicates that NGC 533 group is the object. It is not obvious 
whether both are targets.

0111520301: This is a ToO observation of GRB 010220, the 
field name as given in the proposal is wrong. 

0112200601: Unclear whether the extended emission around 
the pulsar is connected to it, the field classification is there- 
fore 'unknown'. 

0112200701: Unclear whether the extended emission around 
the pulsar is connected to it, the field classification is there- 
fore 'unknown'. 

0112201101: The pulsar is located in SNR W44 (cf. proposal-ID
008327), and extended emission is detected; the field clas- 
sification is taken to be the same as for proposal-ID 008327. 

011226: The target is a merging galaxy cluster, A399/A401. 
There are four observations in different offset positions. The 
Simbad column lists for each observation the cluster that is 
nearer to the centre of the FOV, where possible. 

011305: The proposal abstract mentions clumpy sources in the 
neighbourhood of pulsars, and the field classification is 
somewhat ambiguous (with respect to actual detections). 

0135960101: The proposal abstract describes the object as X- 
ray reflection nebula. There is no Simbad type for that but it 
seems appropriate to use. 

0141610601: The Simbad position appears to be wrong (the co- 
ordinates in the name were assumed to be B1950 and con- 
verted to J2000 coordinates). 

014363: This is a double star but the X-ray detection is not at the 
Simbad position, and the field classification is ambiguous. 

0149630301: The proposal explains LMC1 to be a supergiant 
shell, while Simbad knows only a symbiotic star named 
LMC1. Instead Simbad knows the supergiant shell as LMC- 
SGS 1. 

0154750401: Both the proposal position and the Simbad posi- 
tion are offset from the identified source. The correct identi- 
fication of this source comes from other observations of the 
same object (proposal-ID 020100). 

0154750301: Though the proposal position and Simbad posi- 
tion are not centred on the source identification given, the 
identification seems unambiguous (note that the Simbad po- 
sition is not very precise which would explain the offset). 

0201270101: The Simbad position appears to be wrong (the co- 
ordinates in the name were assumed to be B1950 and con- 
verted to J2000 coordinates). 

0202940201: The declination is wrong; the field of the observa-
tion is therefore as a whole serendipitous.

0203540901: From the field name and proposal abstract it is not
clear whether this is a field or point-source observation. 

0204010101: The target is three point sources. 

020422: The field name is a composite of several target names. 

020619: According to the proposal abstract the target type is an 
X-ray compact source. 

021047: This is an observation of a super-bubble; the field clas-
sification is ambiguous ('x' or 'l').



0303670101: The proposal abstract indicates that this is an ob- 
servation of two galaxy groups, the Simbad name is given 
for the first name only. 

0304050101: It is not clear if this is a point source or a small 
extended source. 
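
The notes above are keyed either by a full 10-digit observation number or by a 6-digit proposal-ID (the first six digits of an observation number). A lookup that respects this convention might look as follows. This is a minimal sketch: the dictionary layout and the function name are hypothetical, and only two of the notes are copied in for illustration.

```python
# Two notes copied from the list above; the dictionary layout is hypothetical.
NOTES = {
    "0204010101": "The target is three point sources.",
    "020422": "The field name is a composite of several target names.",
}

def find_note(obs_id, notes=NOTES):
    """Return the note for a 10-digit observation ID, or None if there is none.

    A note attached to the full observation ID takes precedence; otherwise
    the 6-digit proposal-ID (first six digits) is tried.
    """
    if len(obs_id) != 10 or not obs_id.isdigit():
        raise ValueError("observation IDs are expected as 10-digit strings")
    if obs_id in notes:
        return notes[obs_id]
    return notes.get(obs_id[:6])
```

Observation IDs are kept as strings here so that leading zeros are preserved.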



Appendix D: Catalogue columns 

The catalogue contains 297 columns. Each detection was ob- 
served with up to three cameras. For the source detection, the 
total energy range (0.2 - 12 keV) was split into five sub-bands
as well as the XID wide-band (0.5 - 4.5 keV), see Table ??. As a
result, some of the source parameters (like count rates or fluxes) 
are given for each camera and band as well as for the com-
bined cameras (EPIC) and total band. The column names re-
flect this by using a two-letter prefix to indicate the camera [ca
= EP, PN, M1, M2]; in case of parameters that refer to a unique
source rather than an individual detection (Sect. 8.1) the prefix
[SC] is used (it stands for 'source'). Following the prefix comes 
an energy band indicator where applicable (b = 1,2,3,4,5,8,9). 
Entries are NULL when there is no detection with the respective 
camera (that is, the detector coverage of the detection weighted 
by the PSF, MASKFRAC, < 0.15). 

In the following, a description for each column is given. The 
name is given in capital letters, the FITS data format in brackets, 
and the unit in square brackets. If the column originates from a 
SAS task, the name of the task follows.

For easier reference the columns are grouped into seven sec- 
tions. 



D.1. Identification of the detection 

Next to the various identifications, cross matches with the 
1XMM and 2XMMp catalogues are given here. There are 9 
columns in this section. 

DETID (J): A consecutive number which identifies each en- 
try (detection) in the catalogue. 

SRCID (J): A unique number assigned to a group of cata-
logue entries which are assumed to be the same source. To iden-
tify members of the same group the distance in arcseconds be-
tween each pair of sources was compared on the 3σ level of
both positional errors. A maximum distance of 7" was assumed,
which was reduced to 0.9 × DIST_NN (distance to the nearest
neighbour) where necessary. See Sect. 8.1 for a more detailed
description. The combined parameters for the unique sources are
described in Sect. D.7.

IAUNAME (21A): The IAU name assigned to the unique 
SRCID. 

SRC_NUM (J), SAS task srcmatch: The (decimal) source
number in the individual source list for this observation as deter- 
mined during the source fitting stage; in the hexadecimal system 
it identifies the source-specific product files belonging to this de- 
tection. 

MATCH_1XMM (21A): The IAU name of the closest 1XMM
source within r = 3", cf. Sect. 8.1.

SEP_1XMM (E) [arcsec]: The distance between this source
and the matched 1XMM source, MATCH_1XMM.

SRCID_2XMMP (J): The unique source ID of the closest
2XMMp source within r = 3", cf. Sect. ??.

MATCH_2XMMP (22A): The IAU name of the closest
2XMMp source, cf. Sect. ??.



31 The documentation on SAS tasks is available through the public
XMM-SAS distribution from the ESAC web pages.

SEP_2XMMP (E) [arcsec]: The distance between this source
and the matched 2XMMp source, MATCH_2XMMP.

D.2. Details of the observation and exposures 

There are 11 columns in this section which covers the meta-data
of a detection. Details on XMM-Newton filters and modes can 
be found in the XMM User Handbook (Ehle et al. 2007). 

OBS_ID (10A): The XMM-Newton observation identifica-
tion.

REVOLUT (4A) [orbit]: The XMM-Newton revolution num-
ber.

MJD_START (D) [d]: Modified Julian Date (i.e., JD - 2400000.5)
of the start of the observation.

MJD_STOP (D) [d]: Modified Julian Date (i.e., JD - 2400000.5)
of the end of the observation.
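
As an illustration of the date convention used by these two columns, the following sketch converts between Julian Date, Modified Julian Date, and a calendar date. The function names are hypothetical, and the MJD values are treated as plain UTC day counts (leap seconds and other time-scale subtleties are ignored), which is an assumption of this example.

```python
from datetime import datetime, timedelta, timezone

# MJD 0.0 corresponds to 1858 November 17, 00:00.
MJD_EPOCH = datetime(1858, 11, 17, tzinfo=timezone.utc)

def jd_to_mjd(jd):
    """Modified Julian Date as defined in the text: MJD = JD - 2400000.5."""
    return jd - 2400000.5

def mjd_to_datetime(mjd):
    """Calendar date for an MJD value, treated as a plain UTC day count."""
    return MJD_EPOCH + timedelta(days=mjd)
```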

OBS_CLASS (J): Quality classification of the whole obser-
vation based on the area flagged as bad in the manual flagging
process as compared to the whole detection area, see Sect. 7.4.
0 means nothing has been flagged; 1 indicates that 0% < area <
0.1% of the total detection mask has been flagged; 2 indicates
that 0.1% < area < 1% has been flagged; 3 indicates that 1%
< area < 10% has been flagged; 4 indicates that 10% < area <
100% has been flagged; and 5 means that the whole field was
flagged as bad.
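
The class boundaries above can be written as a small lookup. This is a sketch only: the function name is hypothetical, and because the text uses strict inequalities on both sides, the assignment of values lying exactly on a boundary (0.1%, 1%, 10%) to the higher class is an assumption.

```python
def obs_class(flagged_fraction):
    """Map the flagged fraction of the detection mask (0..1) to a class 0-5."""
    if not 0.0 <= flagged_fraction <= 1.0:
        raise ValueError("flagged_fraction must lie between 0 and 1")
    if flagged_fraction == 0.0:
        return 0          # nothing flagged
    if flagged_fraction >= 1.0:
        return 5          # whole field flagged as bad
    if flagged_fraction < 0.001:
        return 1          # 0% < area < 0.1%
    if flagged_fraction < 0.01:
        return 2          # 0.1% to 1% (boundary handling assumed)
    if flagged_fraction < 0.1:
        return 3          # 1% to 10% (boundary handling assumed)
    return 4              # 10% to 100%
```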

PN_FILTER (6A): PN filter. The options are Thick, Medium,
Thin1, and Thin2, indicating the degree of the optical blocking
desired.

M1_FILTER (6A): M1 filter. The options are Thick, Medium,
and Thin1, indicating the degree of the optical blocking desired.

M2_FILTER (6A): Same as M1_FILTER but for M2.

PN_SUBMODE (23A): PN observing mode. The options are
full frame mode with the full FOV exposed (in two sub-modes),
and large window mode with only parts of the FOV exposed
(Sect. ??).

M1_SUBMODE (16A): M1 observing mode. The options are
full frame mode with the full FOV exposed, partial window
mode with only parts of the central CCD exposed (in different
sub-modes), and timing mode where the central CCD was not
exposed ('Fast Uncompressed'), see Sect. 3.1.

M2_SUBMODE (16A): Same as M1_SUBMODE but for
M2.

D.3. Coordinates 

The catalogue lists rectified ('external') equatorial and Galactic 
coordinates as well as uncorrected ('internal') equatorial coordi- 
nates. Two independent error estimates are combined into a third 
error column. There are 9 columns in this section. 

RA (D) [deg], SAS task evalcorr: Corrected Right
Ascension of the detection (J2000) after statistical correlation of
the emldetect coordinates, RA_UNC and DEC_UNC, with the
USNO B1.0 optical source catalogue. In cases where the cross-
correlation is determined to be unreliable no correction is ap-
plied and this value is therefore the same as RA_UNC (Sect. ??).

DEC (D) [deg], SAS task evalcorr: Corrected declina-
tion of the detection (J2000) after statistical correlation of
the emldetect coordinates, RA_UNC and DEC_UNC, with
the USNO B1.0 optical source catalogue. In cases where the
cross-correlation is determined to be unreliable no correction
is applied and this value is therefore the same as DEC_UNC
(Sect. ??).



POSERR (E) [arcsec]: Total position uncertainty calculated
by combining the statistical error, RADEC_ERR, and the 'sys-
tematic' error, SYSERR, as follows:

$\mathrm{POSERR} = \sqrt{\mathrm{RADEC\_ERR}^2 + \mathrm{SYSERR}^2}$
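
The quadrature sum for POSERR is straightforward to reproduce. In the sketch below the function name is hypothetical and the input values in the usage note are purely illustrative.

```python
import math

def total_position_error(radec_err, syserr):
    """Combine statistical and systematic position errors (arcsec) in quadrature."""
    return math.hypot(radec_err, syserr)
```

For instance, an illustrative statistical error of 0.3" combined with a systematic error of 0.4" gives a total of 0.5".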



LII(D) [deg], SAS task evalcorr: Galactic longitude of the 
detection corresponding to the (corrected) coordinates RA and 
DEC. 

BII (D) [deg], SAS task evalcorr: Galactic latitude of the 
detection corresponding to the (corrected) coordinates RA and 
DEC. 

RADEC_ERR (E) [arcsec], SAS task emldetect: Statistical
1σ error on the detection position (RA_UNC and DEC_UNC).

SYSERR (E) [arcsec]: The estimated 'systematic' 1σ error
on the detection position. It is set to be 0.35" if the SAS task
eposcorr resulted in a statistically reliable cross-correlation
with the USNO B1.0 optical catalogue, otherwise the error is
1.0" (Sect. ??).

RA_UNC (D) [deg], SAS task emldetect: Right Ascension
of the source (J2000) as determined by the SAS task emldetect
by fitting a detection simultaneously in all cameras and energy
bands (Sect. 4.4.3).

DEC_UNC (D) [deg], SAS task emldetect: Declination of
the source (J2000) as determined by the SAS task emldetect
by fitting a detection simultaneously in all cameras and energy
bands (Sect. 4.4.3).

D.4. Detection parameters

This section lists 223 columns. The fitted and combined detec-
tion parameters as well as auxiliary information are taken di-
rectly from the source lists created by the SAS tasks emldetect 
and srcmatch. 

Instead of listing each column, descriptions of the general 
parameter (and their errors) are given followed by an indica- 
tor for which bands and camera combinations this parameter 
is available. Most parameters were determined by the SAS task 
emldetect which is described in detail in Sect. 4.4, while some
others were derived by the SAS task srcmatch. XID-band pa- 
rameters are derived in a separate emldetect run and are there- 
fore single-band values which ensures a better handling of the 
error values. 

ca_b_FLUX and ca_b_FLUX_ERR (E) [erg/cm^2/s], SAS
tasks emldetect, srcmatch: Fluxes are given for all combi-
nations of ca = [EP, PN, M1, M2] and b = [1, 2, 3, 4, 5, 8, 9];
they correspond to the flux in the entire PSF and do not need any 
further corrections for PSF losses. 

For the individual cameras, single-band fluxes are calculated 
from the respective band count rate using the filter- and camera-
dependent energy conversion factors given in Table 4 and cor-
rected for the dead time due to the read-out phase. These can be
0.0 if the detection has no counts. The errors are calculated from 
the respective band count rate error using the respective energy 
conversion factors. 

Total-band fluxes and errors for the individual cameras are 
the sum of the fluxes and errors, respectively, from the bands 
1-5. 

The EPIC flux in each band is the mean of the band-specific 
detections in all cameras weighted by the errors, with the error 
on the weighted mean given by 

$\mathrm{EP\_b\_FLUX\_ERR} = \sqrt{1.0 / \sum_{ca} \left(1 / \mathrm{ca\_b\_FLUX\_ERR}^2\right)}$,

where ca = [PN, M1, M2].
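
The EPIC flux combination can be sketched as an inverse-variance weighted mean, which is the weighting consistent with the error formula given here. The function name, the use of None to stand for NULL entries, and the choice to skip cameras with missing or non-positive errors are assumptions of this example.

```python
import math

def epic_flux(fluxes, errors):
    """Inverse-variance weighted mean flux and its error over the cameras.

    fluxes, errors: per-camera values (e.g. PN, M1, M2); None marks a camera
    without a detection. Returns (None, None) if no camera is usable.
    """
    pairs = [(f, e) for f, e in zip(fluxes, errors)
             if f is not None and e is not None and e > 0.0]
    if not pairs:
        return None, None
    weights = [1.0 / e ** 2 for _, e in pairs]
    weight_sum = sum(weights)
    mean = sum(w * f for w, (f, _) in zip(weights, pairs)) / weight_sum
    # Error on the weighted mean: sqrt(1 / sum(1 / err^2)).
    return mean, math.sqrt(1.0 / weight_sum)
```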

ca_b_RATE and ca_b_RATE_ERR (E) [count/s], SAS task
emldetect: Count rates and errors are given for all combina-
tions of ca = [PN, M1, M2] and b = [1, 2, 3, 4, 5, 8, 9] as well
as for ca = [EP] and b = [8, 9].

The single-band count rate is the band-dependent source 
counts (see ca_b_CTS) divided by the exposure map, which com- 
bines the mirror vignetting, detector efficiency, bad pixels and 
CCD gaps, and an OOT-factor (Out Of Time) depending on the 
PN modes. The source counts and with it the count rates were 
implicitly background subtracted during the fitting process. They 
correspond to the count rate in the entire PSF and do not need 
any further corrections for PSF losses. Note that rates can be 0.0 
(but not negative) if the source is too faint in the respective band 
to be detectable. 

Total-band count rates for each camera are calculated as the
sum of the count rates in the individual bands 1-5.

The EPIC rates are the sum of the camera-specific count rates 
in the respective band. 

ca_b_CTS and ca_b_CTS_ERR (E) [count], SAS task
emldetect: Source counts and errors are given for ca = [EP,
PN, M1, M2] and b = [8].

The single-band source counts (not given in the catalogue) 
are derived under the total PSF (point spread function) and 
corrected for background. The PSF is fitted on sub-images of 
r = 60" in each band, which means that in most cases at least
90% of the PSF (if covered by the detector) was effectively used 
in the fit. 

Combined band source counts for each camera are calculated 
as the sum of the source counts in the individual bands 1-5. 

The EPIC counts are the sum of the camera-specific counts. 

The error is the statistical 1σ error on the total source counts
of the detection. 

ca_b_DETML (E), SAS task emldetect: Maximum likeli-
hoods are derived for all combinations of ca = [PN, M1, M2]
and b = [1, 2, 3, 4, 5, 8, 9] as well as for ca = [EP] and b = [8, 9].

The single-band maximum likelihood values stand for the
detection likelihood of the source, L = -ln P, where P is the
probability the detection is spurious due to a Poissonian fluctu- 
ation. While the detection likelihood of an extended source is 
computed in the same way, systematic effects such as deviations 
between the real background and the model, have a greater effect 
on extended sources and thus detection likelihoods of extended 
sources are more uncertain. 

To calculate the maximum likelihood values for the total 
band and EPIC the sum of the individual likelihoods is nor- 
malised to two degrees of freedom using the function 

$L = -\ln\left(1 - P_\Gamma\left(\frac{\nu}{2}, L'\right)\right)$ with $L' = \sum_{i=1}^{N} L_i$,

where $P_\Gamma$ is the incomplete Gamma function, N is the number of
energy bands involved, $\nu$ is the number of degrees of freedom of
the fit ($\nu$ = 3 + N, if the source extent is a fitted parameter, see
Sect. 4.4.4, and $\nu$ = 2 + N otherwise).
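
The normalisation to two degrees of freedom can be sketched numerically as follows. This is an illustration under stated assumptions, not the emldetect implementation: the incomplete Gamma function is taken to be the regularised lower function P(a, x), so that 1 - P is the regularised upper function Q(a, x); the function names are hypothetical; and Q is evaluated in pure Python (series and continued-fraction forms) and returned as a logarithm so that large likelihoods do not underflow.

```python
import math

def log_upper_gamma(a, x, eps=1e-14, max_iter=10000):
    """ln Q(a, x), with Q = 1 - P the regularised upper incomplete gamma function."""
    if x <= 0.0:
        return 0.0  # Q(a, 0) = 1
    log_pref = a * math.log(x) - x - math.lgamma(a)
    if x < a + 1.0:
        # Power series for P(a, x); Q = 1 - P is well conditioned here.
        term = total = 1.0 / a
        denom = a
        for _ in range(max_iter):
            denom += 1.0
            term *= x / denom
            total += term
            if abs(term) < abs(total) * eps:
                break
        return math.log1p(-total * math.exp(log_pref))
    # Continued fraction for Q(a, x), evaluated with the modified Lentz method.
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, max_iter):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        d = tiny if abs(d) < tiny else d
        c = b + an / c
        c = tiny if abs(c) < tiny else c
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return log_pref + math.log(h)

def combined_detml(band_likelihoods, extent_fitted=False):
    """Combine N single-band likelihoods into one equivalent likelihood.

    nu = 3 + N if the source extent was fitted, 2 + N otherwise.
    """
    n_bands = len(band_likelihoods)
    nu = (3 if extent_fitted else 2) + n_bands
    return -log_upper_gamma(nu / 2.0, sum(band_likelihoods))
```

For ν = 2 the function Q(1, L') equals exp(-L'), so the normalised value reduces to L' itself, which is the sense in which the result is expressed for two degrees of freedom. Where SciPy is available, `scipy.special.gammaincc` provides the same regularised upper function.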

EP_EXTENT and EP_EXTENT_ERR (E) [arcsec], SAS task 
emldetect: The extent radius (i.e., core radius) and error of a 
source detected as extended is determined by fitting a beta-model
profile to the source PSF (Sect. 4.4.4). Anything below 6" is
considered to be a point source and the extent is re-set to zero. 
To avoid non-converging fitting an upper limit of 80" has been 
introduced. 

EP_EXTENT_ML (E), SAS task emldetect: The extent like-
lihood is the likelihood of the detection being extended as given
by L_ext = -ln p, where p is the probability of the extent occur-
ring by chance.

ca_HRn and ca_HRn_ERR (E), SAS tasks emldetect,
srcmatch: The hardness ratios are given for ca = [EP, PN, M1,
M2] and n = [1, 2, 3, 4]. They are defined as the ratio between
the count rates R in bands n and n + 1:

$\mathrm{HR}n = (R_{n+1} - R_n) / (R_{n+1} + R_n)$.

In the case where the rate in one band is 0.0 (i.e., too faint to be 
detected in this band) the hardness ratio will be -1 or +1 which 
is only a lower or upper limit, respectively. In case where the rate 
in both bands is zero, the hardness ratio is undefined (NULL). 
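
The hardness ratio definition and its limiting cases can be captured in a few lines. The function name is hypothetical and None is used here to stand for a NULL entry.

```python
def hardness_ratio(rate_n, rate_n_plus_1):
    """HRn from the count rates in band n and band n + 1.

    Returns None (NULL) if both rates are zero. If only one rate is zero the
    result is -1 or +1, which should be read as a limit rather than a value.
    """
    total = rate_n_plus_1 + rate_n
    if total == 0.0:
        return None
    return (rate_n_plus_1 - rate_n) / total
```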

Errors are the 1σ errors on the hardness ratios.

EPIC hardness ratios are calculated by the SAS task 
srcmatch and are averaged over all three cameras [PN, Ml, 
M2]. Note that no energy conversion factor was used and that 
the EPIC hardness ratios are de facto not hardness ratios but an 
equivalent parameter helpful to characterise the hardness of a 
source. 

ca_b_EXP (E) [s], SAS task emldetect: The exposure map
values are given for combinations of ca = [PN, M1, M2] and b
= [1, 2, 3, 4, 5]. They are the PSF-weighted mean of the area of
the sub-images (r = 60") in the individual-band exposure maps
(cf. Sect. ??).

ca_b_BG (E) [count/pixel], SAS task emldetect: The back-
ground map values are given for combinations of ca = [PN, M1,
M2] and b = [1, 2, 3, 4, 5]; they are derived from the back- 
ground maps at the given detection position. Note that the source 
fitting routine uses the background map itself rather than the sin- 
gle value given here. The value is (nearly) zero if the detection 
position lies outside the FOV. 

ca_b_VIG (E), SAS task emldetect: The vignetting values 
are given for combinations of ca = [PN, M1, M2] and b = [1, 2, 
3, 4, 5]. They are a function of energy band and off-axis angle. 
Note that the source parametrisation uses the vignetted exposure 
maps instead. 

ca_ONTIME (E) [s]: The ontime values, given for ca = [PN, 
M1, M2], are the total good exposure time (after GTI filtering) 
of the CCD where the detection is positioned. Note that some 
source positions fall into CCD gaps or outside of the detector 
and therefore have a NULL value. 

ca_OFFAX (E) [arcmin], SAS task emldetect: The off-axis 
angles, given for ca = [PN, M1, M2], are the distance between 
the detection position and the on-axis position on the respective 
detector; the off-axis angle for a camera can be greater than 
15' when the detection is located outside the FOV of that camera. 

ca_MASKFRAC (E), SAS task emldetect: The maskfrac 
values, given for ca = [PN, M1, M2], are the PSF-weighted mean 
of the detector coverage of the detection. The coverage depends 
slightly on energy; only band 8 values are given here, which are 
the minimum of the energy-dependent maskfrac values. Sources 
which have less than 0.15 of their PSF covered by the detector are 
considered as being not detected. 

EP_DIST_NN (E) [arcsec], SAS task emldetect: The dis- 
tance to the nearest neighbouring detection; note that there is an 
internal threshold of 6" (before positional fitting) for splitting a 
source into two. 



32 This is the optical axis which is close to but not the same as the 
geometrical centre of the detector. 
D.5. Detection flags 

This section lists quality flags as well as flags for the presence 
of time-series or spectra for a detection. There are 7 columns in 
this section. 

SUM_FLAG (J): The summary flag of the source is derived 
from the EPIC flag EP_FLAG as explained in detail in Sect. 7.3. 
The values are: 

0 = good, 

1 = source parameters may be affected, 

2 = possibly spurious, 

3 = located in an area where spurious detection may occur, 

4 = located in an area where spurious detection may occur 
and possibly spurious. 

EP_FLAG (12A), SAS task dpssflag: EPIC flag that 
combines the flags in each camera [PN_FLAG, M1_FLAG, 
M2_FLAG], that is, a flag is set in EP_FLAG if at least one of 
the camera-dependent flags is set. 
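
The combination rule can be pictured as a position-by-position logical OR of the camera flag strings. The sketch below assumes, purely for illustration, that each flag is a 12-character string of 'T' and 'F' and that a camera not used in the detection is given as a string of dashes; the actual character encoding and the behaviour of dpssflag are not specified here.

```python
def combine_camera_flags(*camera_flags: str) -> str:
    """OR together 12-character camera flag strings position by position.

    Assumed encoding (hypothetical): 'T' = set, 'F' = not set,
    '-' = camera not used in the source detection.
    """
    used = [flags for flags in camera_flags if "-" not in flags]
    if not used:
        return "-" * 12
    return "".join(
        "T" if any(flags[i] == "T" for flags in used) else "F"
        for i in range(12)
    )
```

For example, combining "FTFFFFFFFFFF", "TFFFFFFFFFFF" and a camera that was not used gives a string with the first two flags set.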

PN_FLAG (12A), SAS task dpssflag: PN flag made of the 
flags [1-12] (counted from left to right) for the PN source 
detection. A flag is set to True according to the conditions 
summarised in Sect. 7.3 for the automatic flags and Sect. 7.4 for 
the manual flags. In the case where the camera was not used in the 
source detection a dash is given. In the case where a source was 
not detected by this camera the flags are all set to False 
(default). Flag [10] is not used. 

M1_FLAG (12A), SAS task dpssflag: Same as PN_FLAG 
but for M1. 

M2_FLAG (12A), SAS task dpssflag: Same as PN_FLAG 
but for M2. 

TSERIES (L): The flag is set to True if this source has a 
time-series made in at least one exposure (Sect. 5). 

SPECTRA (L): The flag is set to True if this source has a 
spectrum made in at least one exposure (Sect.0. 

D.6. Variability information 

This section lists 7 columns with variability information for 
those detections for which time-series were extracted. 

EP_CHI2PROB (E): The minimum value of the available 
camera probabilities [PN_CHI2PROB, M1_CHI2PROB, 
M2_CHI2PROB]. 

PN_CHI2PROB (E), SAS task ekstest: The χ²-probability 
(based on the null hypothesis) that the source as detected by the 
PN camera is constant. Pearson's approximation to χ² for 
Poissonian data was used, in which the model is used as the 
estimator of its own variance (Sect. 5.2). If more than one 
exposure (that is, time-series) is available for this source, the 
lowest value of probability was used. 

M1_CHI2PROB (E), SAS task ekstest: Same as 
PN_CHI2PROB but for M1. 

M2_CHI2PROB (E), SAS task ekstest: Same as 
PN_CHI2PROB but for M2. 

VAR_FLAG (L): The flag is set to True if this source was 
detected as variable, that is, EPIC χ²-probability < $10^{-5}$ (see 
EP_CHI2PROB). 
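
The two rules above (minimum over the available cameras, then a threshold of 10^-5) can be sketched as follows. The function names are illustrative, None stands in for a camera without a time-series, and returning False when no probability is available is an assumption of this example rather than documented catalogue behaviour.

```python
from typing import Optional

VARIABILITY_THRESHOLD = 1e-5  # chi-squared probability below which a source is flagged


def ep_chi2prob(pn: Optional[float] = None,
                m1: Optional[float] = None,
                m2: Optional[float] = None) -> Optional[float]:
    """Minimum of the available camera probabilities (None if none exist)."""
    available = [p for p in (pn, m1, m2) if p is not None]
    return min(available) if available else None


def var_flag(ep_prob: Optional[float]) -> bool:
    """True if the EPIC probability of constancy is below the threshold."""
    return ep_prob is not None and ep_prob < VARIABILITY_THRESHOLD
```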

VAR_EXP_ID (4A): If the source is detected as variable (that 
is, if VAR_FLAG is set to True), the exposure ID ('S' or 'U' 
followed by a three-digit number) of the exposure with the lowest 
χ²-probability is given here. 

VAR_INST_ID (2A): If the source is detected as variable (that 
is, if VAR_FLAG is set to True), the instrument ID [PN, M1, M2] 
of the exposure given in VAR_EXP_ID is listed here. 



D.7. Unique source parameters 

This section lists 31 columns with combined parameters for the 
unique sources (using the prefix 'SC') together with the total 
number of detections per source. For a detailed description of 
how the detections are matched see Sect. 8.1. 

SC_RA (D) [deg]: The mean Right Ascension in degrees 
(J2000) of all the detections of the source SRCID (see RA) 
weighted by the positional errors POSERR. 

SC_DEC (D) [deg]: The mean declination in degrees (J2000) 
of all the detections of the source SRCID (see DEC) weighted 
by the positional errors POSERR. 

SC_POSERR (E) [arcsec]: The error of the weighted mean 
position given in SC_RA and SC_DEC in arcseconds. 

SC_EP_b_FLUX (E) [erg/cm^2/s]: The mean band b flux 
of all the detections of the source SRCID (see EP_b_FLUX) 
weighted by the errors (EP_b_FLUX_ERR), where b = 
[1,2,3,4,5,8,9]. 

SC_EP_b_FLUX_ERR (E) [erg/cm^2/s]: Error on the weighted 
mean band b flux in SC_EP_b_FLUX, where b = [1,2,3,4,5,8,9]. 
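
The exact weighting scheme is not spelled out in this section. A common choice, assumed here for illustration only, is the inverse-variance weighted mean, in which each detection is weighted by 1/err^2 and the error on the mean is 1/sqrt(sum of weights). The function name is hypothetical.

```python
import math
from typing import Sequence, Tuple


def weighted_mean(values: Sequence[float],
                  errors: Sequence[float]) -> Tuple[float, float]:
    """Inverse-variance weighted mean and its error (an assumed scheme).

    values: per-detection measurements (e.g. band fluxes)
    errors: their 1-sigma errors (must be positive)
    """
    weights = [1.0 / (err * err) for err in errors]
    weight_sum = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / weight_sum
    return mean, 1.0 / math.sqrt(weight_sum)
```

With this scheme, detections with small errors dominate the combined value, and the combined error shrinks as more detections are added.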

SC_HRn (E): The mean hardness ratio of the bands n and 
n + 1 of all the detections of the source SRCID (see EP_HRn) 
weighted by the errors (see EP_HRn_ERR), where n = [1, 2, 3, 4]. 

SC_HRn_ERR (E): Error on the weighted mean hardness 
ratio in SC_HRn. 

SC_DET_ML (E): The total-band detection likelihood of the 
source SRCID is the maximum of the likelihoods of all detec- 
tions of this source (see EP_8_DET_ML). 

SC_EXT_ML (E): The total-band extent likelihood of the 
extended source SRCID is the average of the extent likelihoods 
of all detections of this source (see EP_EXTENT_ML). 

SC_CHI2PROB (E): The χ²-probability (based on the null 
hypothesis) that the unique source SRCID as detected by any 
of the observations is constant, that is, the minimum value of 
the EPIC probabilities in each detection (see EP_CHI2PROB) is 
given. 

SC_VAR_FLAG (L): The variability flag for the unique source 
SRCID is set to VAR_FLAG of the most variable detection of 
this source. 

SC_SUM_FLAG (J): The summary flag for the unique source 
SRCID is taken to be the maximum flag of all detections of this 
source (see SUM_FLAG). 

N_DETECTIONS (J): The number of detections of the 
unique source SRCID used to derive the combined values. 
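
The non-averaged unique-source columns follow simple aggregation rules (maximum, average, minimum). The sketch below collects them in one place; the dictionary keys used for the per-detection input are hypothetical stand-ins for the detection columns, and None stands for a detection without a time-series.

```python
from typing import Any, Dict, List


def combine_detections(detections: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Aggregate per-detection values into unique-source values."""
    with_prob = [d for d in detections if d["chi2prob"] is not None]
    # The most variable detection is the one with the lowest probability
    # of being constant.
    most_variable = min(with_prob, key=lambda d: d["chi2prob"]) if with_prob else None
    return {
        "SC_DET_ML": max(d["det_ml"] for d in detections),
        "SC_EXT_ML": sum(d["ext_ml"] for d in detections) / len(detections),
        "SC_CHI2PROB": most_variable["chi2prob"] if most_variable else None,
        "SC_VAR_FLAG": most_variable["var_flag"] if most_variable else None,
        "SC_SUM_FLAG": max(d["sum_flag"] for d in detections),
        "N_DETECTIONS": len(detections),
    }
```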